Abstract

We consider the problem of approximating sums of high-dimensional stationary
time series by Gaussian vectors, using the framework of functional dependence
measure. The validity of the Gaussian approximation depends on the sample size n,
the dimension p, the moment condition and the dependence of the underlying
processes. We also consider an estimator for long-run covariance matrices and study
its convergence properties. Our results allow constructing simultaneous confidence
intervals for mean vectors of high-dimensional time series with asymptotically
correct coverage probabilities. A Gaussian multiplier bootstrap method is proposed. A
simulation study indicates the quality of Gaussian approximation with different n, p
under different moment and dependence conditions.


1 Introduction 

During the past decade, there has been a significant development in high-dimensional data
analysis with applications in many fields. In this paper we shall consider simultaneous
inference for mean vectors of high-dimensional stationary processes, so that one can perform
family-wise multiple testing or construct simultaneous confidence intervals, an important
problem in the analysis of spatial-temporal processes. To fix the idea, let $X_i$ be a
stationary process in $\mathbb{R}^p$ with mean $\mu = (\mu_1, \ldots, \mu_p)^\top$ and finite
second moment in the sense that $\mathbb{E}(X_i^\top X_i) < \infty$. In the scalar case in which
$p = 1$ or when $p$ is fixed, under suitable weak dependence conditions, we can have the
central limit theorem (CLT)

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - \mu) \Rightarrow N(0, \Sigma), \quad \text{where } \Sigma = \sum_{k=-\infty}^{\infty} \mathbb{E}\big((X_0 - \mu)(X_k - \mu)^\top\big). \tag{1}$$

See, for example, Rosenblatt (1956), Ibragimov and Linnik (1971), Wu (2005), Dedecker
et al. (2007) and Bradley (2007) among others. In the high dimension case in which $p$ can
also diverge to infinity, Portnoy (1986) showed that the central limit theorem can fail for
i.i.d. random vectors if $\sqrt{n} = o(p)$. In this paper we shall consider an alternative form:
Gaussian approximation for the largest entry of the sample mean vector
$\bar{X}_n = n^{-1} \sum_{i=1}^{n} X_i$. For a vector $v = (v_1, \ldots, v_p)^\top$, let
$|v|_\infty = \max_{j \le p} |v_j|$. Specifically, our primary goal is to establish the
Gaussian Approximation (GA) in

$$\sup_{u \ge 0} \big| \mathbb{P}(\sqrt{n}\, |\bar{X}_n - \mu|_\infty \ge u) - \mathbb{P}(|Z|_\infty \ge u) \big| \to 0, \tag{2}$$

where both $n, p \to \infty$. Here the Gaussian vector $Z = (Z_1, \ldots, Z_p)^\top \sim N(0, \Sigma)$.
Chernozhukov et al. (2013a) studied the Gaussian approximation for independent random
vectors. There has been limited research on high-dimensional inference under dependence.
The associated statistical inference becomes considerably more challenging since the
autocovariances with all lags should be considered. Zhang and Cheng (2014) extended the
Gaussian approximation in Chernozhukov et al. (2013a) to very weakly dependent random
vectors which satisfy a uniform geometric moment contraction condition. The latter
condition is also adopted in Chen et al. (2015) for self-normalized sums. Chernozhukov et al.
(2013b) did a similar extension to strong mixing random vectors. Here we shall establish
(7) for a wide class of high-dimensional stationary processes under suitable conditions on
the magnitudes of $p$, $n$, and mild dependence conditions on the process $(X_i)$.

In Section 2 we shall introduce the framework of high-dimensional time series and 
some concepts about functional and predictive dependence measures that are useful for 
establishing an asymptotic theory. The main result for Gaussian approximation of the 
normalized mean vector and the choice of the normalization matrix is established in Section 
3. Depending on the moment and the dependence conditions, both high dimension and 
ultra high dimension cases are discussed. 

To perform statistical inference based on (7), one needs to estimate the long-run
covariance matrix $\Sigma$. The latter problem has been extensively studied in the scalar case; see
Politis et al. (1999), Bühlmann (2002), Lahiri (2003), Alexopoulos and Goldsman (2004),
among others. In Section 4 we study the batched-mean estimate of long-run covariance
matrices and derive a large deviation result about quadratic forms of stationary processes.
The latter tail probability inequalities allow dependent and/or non-sub-Gaussian
processes under mild conditions, and are expected to be useful in other high-dimensional
inference problems for dependent vectors. The consistency of the batched-mean estimate
ensures the validity of the normalized Gaussian multiplier bootstrap method.

We provide in Section 5 some sharp inequalities for tail probabilities for dependent
processes in both polynomial tail and exponential tail cases. Some of the proofs are relegated
to Section 6.

We now introduce some notation. For a random variable $X$ and $q > 0$, we write
$X \in \mathcal{L}^q$ if $\|X\|_q := (\mathbb{E}|X|^q)^{1/q} < \infty$, and for a vector
$v = (v_1, \ldots, v_p)^\top$, let the norm-$s$ length
$|v|_s = (\sum_{j=1}^{p} |v_j|^s)^{1/s}$, $s \ge 1$. Write the $p \times p$ identity matrix as
$\mathrm{Id}_p$. For two real numbers, set $x \vee y = \max(x, y)$ and $x \wedge y = \min(x, y)$.
For two sequences of positive numbers $(a_n)$ and $(b_n)$, we write $a_n \asymp b_n$
(resp. $a_n \lesssim b_n$ or $a_n \ll b_n$) if there exists some constant $C > 0$ such that
$C^{-1} \le a_n/b_n \le C$ (resp. $a_n/b_n \le C$ or $a_n/b_n \to 0$) for all large $n$.
We use $C, C_1, C_2, \ldots$ to denote positive constants whose values may differ from place
to place. A constant with a symbolic subscript is used to emphasize the dependence of the
value on the subscript. Throughout the paper, we assume $p = p_n \to \infty$ as $n \to \infty$.

2 High-dimensional Time Series 

Let $\varepsilon_i$, $i \in \mathbb{Z}$, be i.i.d. random variables and
$\mathcal{F}^i = (\ldots, \varepsilon_{i-1}, \varepsilon_i)$; let $(X_i)$ be a stationary
process taking values in $\mathbb{R}^p$ that assumes the form

$$X_i = (X_{i1}, X_{i2}, \ldots, X_{ip})^\top = G(\mathcal{F}^i), \tag{3}$$

where $G(\cdot) = (g_1(\cdot), \ldots, g_p(\cdot))^\top$ is an $\mathbb{R}^p$-valued measurable
function such that $X_i$ is well defined. In the scalar case with $p = 1$, (3) allows a very
general class of stationary processes (cf. Wiener (1958), Rosenblatt (1971), Priestley (1988),
Tong (1990), Wu (2005), Tsay (2005), Wu (2011)). It includes linear processes as well as a
large class of nonlinear time series models. Within this framework, $(\varepsilon_i)$ can be
viewed as independent inputs of a physical system and all the dependences among the outputs
$(X_i)$ result from the underlying data-generating mechanism $G(\cdot)$. The function
$g_j(\cdot)$, $1 \le j \le p$, is the $j$-th coordinate projection of $G(\cdot)$. Unless
otherwise specified, assume throughout the paper that $\mathbb{E}X_i = 0$ and
$\max_{j \le p} \|X_{ij}\|_q < \infty$ for some $q \ge 2$. Let
$\Gamma(k) = (\gamma_{ij}(k))_{i,j=1}^{p} = \mathbb{E}(X_i X_{i+k}^\top)$ be the
autocovariance matrix and recall the long-run covariance matrix

$$\Sigma = \sum_{k=-\infty}^{\infty} \Gamma(k) = (\sigma_{ij})_{i,j=1}^{p} \tag{4}$$

if it exists. Note that $\sigma_{jj} = \sum_{k=-\infty}^{\infty} \gamma_{jj}(k)$,
$1 \le j \le p$, is the long-run variance of the component process
$X_{\cdot j} = (X_{ij})_{i \in \mathbb{Z}}$. For the latter process, following Wu (2005) we define
respectively the functional dependence and the predictive dependence measure



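$$\delta_{i,q,j} = \|X_{ij} - X_{ij,\{0\}}\|_q \quad \text{and} \quad \theta_{i,q,j} = \|\mathbb{E}(X_{ij} \mid \mathcal{F}^0) - \mathbb{E}(X_{ij} \mid \mathcal{F}^{-1})\|_q, \tag{5}$$

where $X_{ij,\{k\}} = g_j(\mathcal{F}^{i,\{k\}})$,
$\mathcal{F}^{i,\{k\}} = (\ldots, \varepsilon_{k-1}, \varepsilon_k', \varepsilon_{k+1}, \ldots, \varepsilon_i)$
is a coupled version of $\mathcal{F}^i$ with $\varepsilon_k$ in $\mathcal{F}^i$ replaced
by $\varepsilon_k'$, and $\varepsilon_k, \varepsilon_l'$, $k, l \in \mathbb{Z}$, are i.i.d. random
variables, $\mathcal{F}_i^j = (\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_j)$ and
$\mathcal{F}_i = (\varepsilon_i, \varepsilon_{i+1}, \ldots)$. Note that
$\mathcal{F}^{i,\{k\}} = \mathcal{F}^i$ if $k > i$. To account for the dependence in the process
$X_{\cdot j}$, we define the dependence adjusted norm

$$\|X_{\cdot j}\|_{q,\alpha} = \sup_{m \ge 0} (m+1)^{\alpha} \Delta_{m,q,j}, \quad \alpha \ge 0, \quad \text{where } \Delta_{m,q,j} = \sum_{i=m}^{\infty} \delta_{i,q,j}. \tag{6}$$

Due to the dependence, it may happen that $\|X_{ij}\|_q < \infty$ while
$\|X_{\cdot j}\|_{q,0} = \infty$. Elementary calculations show that, if $X_{ij}$,
$i \in \mathbb{Z}$, are i.i.d., then
$\|X_{ij}\|_q \le \|X_{\cdot j}\|_{q,0} \le 2\|X_{ij}\|_q$, suggesting
that the dependence adjusted norm is equivalent to the classical norm.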
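
As a minimal numerical sketch of the dependence adjusted norm, the snippet below evaluates $\sup_{m \ge 0} (m+1)^{\alpha} \Delta_m$ with $\Delta_m = \sum_{i \ge m} \delta_i$ (the form also used in (39)) from a finite list of functional dependence measures. The function name, the truncation of the infinite sum to the supplied list, and the geometric example values are assumptions made for illustration only.

```python
def dependence_adjusted_norm(deltas, alpha):
    """Return sup_{m>=0} (m+1)^alpha * Delta_m, where Delta_m = sum_{i>=m} deltas[i].

    `deltas` is a finite list standing in for (delta_i)_{i>=0}; terms beyond
    the end of the list are treated as zero (a truncation assumption).
    """
    tail = 0.0
    best = 0.0
    # Walk backwards so that `tail` always equals Delta_m for the current m.
    for m in range(len(deltas) - 1, -1, -1):
        tail += deltas[m]
        best = max(best, (m + 1) ** alpha * tail)
    return best


# i.i.d. case: only delta_0 is non-zero, so the norm equals delta_0 for every alpha.
iid_norm = dependence_adjusted_norm([1.5], alpha=2.0)

# Geometric decay delta_i = rho^i (hypothetical values, as for an AR(1)-type recursion).
rho = 0.5
geometric = [rho ** i for i in range(200)]
norm_alpha0 = dependence_adjusted_norm(geometric, alpha=0.0)  # close to 1 / (1 - rho)
norm_alpha1 = dependence_adjusted_norm(geometric, alpha=1.0)
```

In the i.i.d. case the result does not depend on $\alpha$, which is one way to see why the adjusted norm is comparable to the classical one there; with geometrically decaying measures the norm stays finite for every $\alpha$.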

To account for high-dimensionality, we define

$$\Psi_{q,\alpha} = \max_{1 \le j \le p} \|X_{\cdot j}\|_{q,\alpha} \quad \text{and} \quad \Upsilon_{q,\alpha} = \Big( \sum_{j=1}^{p} \|X_{\cdot j}\|_{q,\alpha}^{q} \Big)^{1/q},$$

which can be interpreted as the uniform and the overall dependence adjusted norms of
$(X_i)_{i \in \mathbb{Z}}$, respectively. The form (3) and its associated dependence measures
provide a convenient framework for studying high-dimensional time series. Chen et al. (2013) and
Zhang and Cheng (2014) considered some special cases: the former paper requires that
$\max_{1 \le j \le p} \|X_{\cdot j}\|_{q,\alpha} < C$ while the latter imposes the stronger
geometric moment contraction condition
$\max_{1 \le j \le p} \Delta_{m,q,j} \le C\rho^m$ with $\rho \in (0,1)$, and in both cases the
constant $C$ does not depend on $p$. Those assumptions can be fairly restrictive. In this paper
$\Psi_{q,\alpha}$ can be unbounded in $p$. Additionally, we define the $\mathcal{L}^\infty$
functional dependence measure and its corresponding dependence adjusted norm for the
$p$-dimensional stationary process $(X_i)$

$$\omega_{i,q} = \big\| |X_i - X_{i,\{0\}}|_\infty \big\|_q,$$

$$\big\| |X_\cdot|_\infty \big\|_{q,\alpha} = \sup_{m \ge 0} (m+1)^{\alpha} \Omega_{m,q}, \quad \alpha \ge 0, \quad \text{where } \Omega_{m,q} = \sum_{i=m}^{\infty} \omega_{i,q}.$$

Clearly, we have $\Psi_{q,\alpha} \le \| |X_\cdot|_\infty \|_{q,\alpha} \le \Upsilon_{q,\alpha}$.

3 Gaussian Approximations 

In this section we shall present the main results on Gaussian approximations. Theorem 3.2
concerns the finite polynomial moment case with both weaker and stronger temporal
dependence. In that case the dimension $p$ allowed can be at most a power of $n$. If the
underlying process has finite dependence-adjusted sub-exponential norms, Theorem 3.3
asserts that an ultra-high dimension $p$ can be allowed. Theorem 6.4 in Section 6.1 provides
a convergence rate of the Gaussian approximation.

Recall (4) for the long-run covariance matrix $\Sigma$. Let $\Sigma_0 = \mathrm{diag}(\Sigma)$
be the diagonal matrix of $\Sigma$, and
$D_0 = \mathrm{diag}(\sigma_{11}^{1/2}, \ldots, \sigma_{pp}^{1/2})$. Assume $\mu = 0$. We
consider the following normalized version of (2):

$$\sup_{u \ge 0} \big| \mathbb{P}(\sqrt{n}\, |D_0^{-1} \bar{X}_n|_\infty \ge u) - \mathbb{P}(|D_0^{-1} Z|_\infty \ge u) \big| \to 0. \tag{7}$$
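
The normalized approximation (7) can be explored numerically. The sketch below is an illustration under stated assumptions, not the simulation design of the paper: it takes $p$ independent MA(1) coordinates $X_i = \varepsilon_i + a\varepsilon_{i-1}$ with uniform inputs (so the long-run variance $(1+a)^2/3$ is known in closed form), simulates $\sqrt{n}\,|D_0^{-1}\bar{X}_n|_\infty$ and $|D_0^{-1}Z|_\infty$, and measures the largest gap between their empirical distribution functions. All function names and parameter values are hypothetical.

```python
import math
import random


def max_norm_of_scaled_mean(n, p, a, rng):
    """Simulate sqrt(n) * |D0^{-1} Xbar_n|_inf for p independent MA(1) coordinates.

    Each coordinate is X_i = eps_i + a * eps_{i-1} with uniform(-1, 1) inputs
    (variance 1/3), so its long-run variance is (1 + a)^2 / 3.
    """
    long_run_sd = abs(1.0 + a) * math.sqrt(1.0 / 3.0)
    worst = 0.0
    for _ in range(p):
        prev = rng.uniform(-1.0, 1.0)
        total = 0.0
        for _ in range(n):
            cur = rng.uniform(-1.0, 1.0)
            total += cur + a * prev
            prev = cur
        worst = max(worst, abs(total) / math.sqrt(n) / long_run_sd)
    return worst


def max_norm_of_gaussian(p, rng):
    """Simulate |D0^{-1} Z|_inf, which here is the maximum of p |N(0, 1)| draws."""
    return max(abs(rng.gauss(0.0, 1.0)) for _ in range(p))


def sup_cdf_distance(xs, ys):
    """Largest gap between the empirical distribution functions of xs and ys."""
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    gap = 0.0
    for point in sorted(xs + ys):
        while i < len(xs) and xs[i] <= point:
            i += 1
        while j < len(ys) and ys[j] <= point:
            j += 1
        gap = max(gap, abs(i / len(xs) - j / len(ys)))
    return gap


rng = random.Random(0)
stat = [max_norm_of_scaled_mean(n=200, p=5, a=0.5, rng=rng) for _ in range(400)]
gauss = [max_norm_of_gaussian(p=5, rng=rng) for _ in range(400)]
distance = sup_cdf_distance(stat, gauss)
```

The reported `distance` mixes the approximation error in (7) with Monte Carlo noise from the 400 replications, so it should be read as a rough diagnostic only.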

Assumption 3.1. There exists a constant $c > 0$ such that $\min_{1 \le j \le p} \sigma_{jj} \ge c$.

To state Theorem 3.2, we need to dehne the following quantities: Qg^a = ^q,a A 
(|||X.|ooi|g,alogp), Li = ^ (4'2,a4'2,0 (log p)^) 

^2 = ^i,Jlog(pr^))^ W 3 = (n-“(log(pn))3/20^_„)V(V2-a-iA)^ 

JVl = (n/logp)'?/V0q,a> ^2 = n(logp)"2^-2^ iVg = (logp). 

Theorem 3.2. Let Assumption 3.1 be satisfied. (i) Assume that $\Theta_{q,\alpha} < \infty$ holds
with some $q > 4$ and $\alpha > 1/2 - 1/q$ (the weaker dependence case),

$$\Theta_{q,\alpha}\, n^{1/q - 1/2} (\log(pn))^{3/2} \to 0 \tag{8}$$

and

$$\max(L_1, L_2) \max(W_1, W_2) = o(1) \min(N_1, N_2). \tag{9}$$

Then the Gaussian Approximation (7) holds. (ii) Assume $0 < \alpha < 1/2 - 1/q$ (the stronger
dependence case). Then (7) holds if $\Theta_{q,\alpha} (\log p)^{1/2} = o(n^{\alpha})$ and

$$L_2 \max(W_1, W_2, W_3) = o(1) \min(N_2, N_3). \tag{10}$$
Remark 1. (Optimality of our result on the allowed dimension $p$) Assume $\alpha > 1/2 - 1/q$.
In the special case with $\Psi_{q,\alpha} \asymp 1$ and $\Theta_{q,\alpha} \asymp p^{1/q}$,
(8) becomes

$$p (\log(pn))^{3q/2} = o(n^{q/2 - 1}), \tag{11}$$

which by elementary manipulations implies (9), and hence the GA (7). It turns out that
condition (11), or equivalently $p (\log p)^{3q/2} = o(n^{q/2 - 1})$, is optimal up to a
multiplicative logarithmic term. Consider the special case in which $X_{ij}$ are i.i.d.
symmetric random variables with $\mathbb{E}(X_{ij}^2) = 1$ and the tail probability
$\mathbb{P}(X_{ij} \ge u) = u^{-q} \ell(u)$, $u \ge u_0$, where $\ell(u) = (\log u)^{-2}$.
By Nagaev (1979), we have the expansion: for $y \ge \sqrt{n}$,

$$\mathbb{P}(X_{11} + \ldots + X_{n1} \ge y) \sim n y^{-q} \ell(y) + 1 - \Phi(y/\sqrt{n}). \tag{12}$$

Let $M_n = X_{11} + \ldots + X_{n1}$, $Z = (Z_1, \ldots, Z_p)^\top \sim N(0, \mathrm{Id}_p)$ and
assume

$$n^{q/2 - 1} = o\big(p (\log n)^{-2} (\log p)^{-q/2}\big). \tag{13}$$

Then the Gaussian approximation (7) does not hold. To see this, let $u = (2 \log p)^{1/2}$. Then
$p\,\mathbb{P}(|Z_1| \ge u) \to 0$, and, by (12) and (13),
$p\,\mathbb{P}(M_n \ge \sqrt{n} u) \to \infty$. Hence
$\mathbb{P}^p(|M_n| < \sqrt{n} u) \to 0$ and $\mathbb{P}^p(|Z_1| < u) \to 1$, implying that

$$\big| \mathbb{P}(\sqrt{n}\, |\bar{X}_n|_\infty < u) - \mathbb{P}(|Z|_\infty < u) \big| = \big| \mathbb{P}^p(|M_n| < \sqrt{n} u) - \mathbb{P}^p(|Z_1| < u) \big| = \big| [1 - 2\mathbb{P}(M_n \ge \sqrt{n} u)]^p - \mathbb{P}^p(|Z_1| < u) \big| \to 1.$$

Note that (13) is equivalent to $n^{q/2 - 1} = o(p (\log p)^{-2 - q/2})$, suggesting that (11)
is optimal up to a logarithmic term. □
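
The step $p\,\mathbb{P}(|Z_1| \ge u) \to 0$ with $u = (2\log p)^{1/2}$ used in the remark can be checked numerically. The short sketch below (function name hypothetical) evaluates this quantity through the complementary error function; by the standard Gaussian tail approximation it decays only like $(\pi \log p)^{-1/2}$, so the convergence is slow.

```python
import math


def scaled_gaussian_tail(p):
    """Return p * P(|Z_1| >= u) for Z_1 ~ N(0, 1) and u = sqrt(2 log p)."""
    u = math.sqrt(2.0 * math.log(p))
    # P(|Z_1| >= u) = erfc(u / sqrt(2)) for a standard normal variable.
    return p * math.erfc(u / math.sqrt(2.0))


values = {p: scaled_gaussian_tail(p) for p in (10, 10**3, 10**6, 10**12)}
```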

Now suppose there exist 0 < Ki < ^2 such that and Qg^a ^ and 

X n. Elementary but tedious calculations show that, in the weaker dependence case 
a > 1/2 — 1/q, if 

f K2 2ki 2 (2ki \ 1 , 

T > max ———,-h 8 ki, -h 8 fi:i + 2^2 L (14) 

11/2 - 1 /g a q \ a ) j 

then conditions in (i) of Theorem 3.2 are satisfied, while for the stronger dependence case
with $0 < \alpha < 1/2 - 1/q$, a larger sample size $n$ is required:

r > max { — , + 8ki, (1 — 2a) f + 8 /?!^ + 2 ^ 2 ! (15) 

y a a \ a / J 
The lower bounds in (14) and (15) are both non-decreasing in $\kappa_1$, $\kappa_2$ and
non-increasing in $q$, $\alpha$.

Under (11), the allowed dimension $p$ can only be at most a polynomial of $n$. To ensure
the validity of GA in the ultra-high dimensional case with $\log p = o(n^c)$ with some $c > 0$,
we need to consider the sub-exponential case in which $X_{ij}$ has finite moments of any
order. For $\nu > 0$ and $\alpha > 0$, define the dependence-adjusted sub-exponential norm

$$\|X_{\cdot j}\|_{\psi_\nu, \alpha} = \sup_{q \ge 2} q^{-\nu} \|X_{\cdot j}\|_{q,\alpha} \quad \text{and} \quad \Phi_{\psi_\nu, \alpha} = \max_{j \le p} \|X_{\cdot j}\|_{\psi_\nu, \alpha}.$$

Let La = ((logp)^/^+i/2<h,/,^,„)^/“, X 4 = n(logp)"^" 2 // 3 $- 2 ^ ^nd IU 4 = (log(pn)) 3 + 2 // 3 $ 2 ^^^+ 
(log(pn))"^. Here {3 = 2/{l + 2u). 

Theorem 3.3. Let Assumption 3.1 be satisfied. Assume that $\Phi_{\psi_\nu, \alpha} < \infty$ for
some $\nu > 0$, $\alpha > 0$ and


max(L2, L3) max(hFi, H4) = o{Nfij, Lf max(hFi, H4) 


o{n). 


( 16 ) 


Then the Gaussian Approximation (7) holds.


Proof. The proof is similar to that of Theorem 3.2, and thus is omitted. □ 

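If $\Phi_{\psi_\nu, \alpha} \lesssim 1$, then the ultra high-dimensional case with
$\log p = o(n^c)$ with some $c > 0$ is allowed, where specifically we can let

$$c = \begin{cases} 1/(8 + 2/\alpha + 2/\beta), & 2/3 \le \beta < 2, \\ 1/[7 + (1/\beta + 1/2)(1/\alpha + 2)], & 1/2 \le \beta < 2/3, \\ 1/[3 + 2/\beta + (1/\beta + 1/2)(1/\alpha + 2)], & 0 < \beta < 1/2. \end{cases} \tag{17}$$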
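
To see how the three regimes of (17) fit together, the sketch below evaluates the exponent $c$ as a function of $\alpha$ and $\beta = 2/(1+2\nu)$. It reads the cases as $1/(8 + 2/\alpha + 2/\beta)$ for $\beta \ge 2/3$, $1/[7 + (1/\beta + 1/2)(1/\alpha + 2)]$ for $1/2 \le \beta < 2/3$, and $1/[3 + 2/\beta + (1/\beta + 1/2)(1/\alpha + 2)]$ for $\beta < 1/2$; the assignment of the boundary points to a case is an assumption, which is harmless because the three expressions agree at $\beta = 2/3$ and $\beta = 1/2$. Function names are hypothetical.

```python
def ultra_high_dim_exponent(alpha, beta):
    """Exponent c in log p = o(n^c), following the three cases of (17) as read above."""
    if not (alpha > 0 and 0 < beta < 2):
        raise ValueError("need alpha > 0 and 0 < beta < 2")
    if beta >= 2.0 / 3.0:
        return 1.0 / (8.0 + 2.0 / alpha + 2.0 / beta)
    cross = (1.0 / beta + 0.5) * (1.0 / alpha + 2.0)
    if beta >= 0.5:
        return 1.0 / (7.0 + cross)
    return 1.0 / (3.0 + 2.0 / beta + cross)


def beta_from_nu(nu):
    """beta = 2 / (1 + 2 nu), the relation stated before Theorem 3.3."""
    return 2.0 / (1.0 + 2.0 * nu)


# Example: nu = 1/2 gives beta = 1, which falls in the first case.
c_example = ultra_high_dim_exponent(alpha=1.0, beta=beta_from_nu(0.5))
```

Larger $\alpha$ (weaker dependence) and larger $\beta$ (lighter tails) both increase $c$, so a faster-growing dimension is permitted.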


3.1 Simultaneous Inference of Covariances 

Let $X_1, \ldots, X_n$ be i.i.d. $p$-dimensional vectors with mean 0 and covariance matrix
$\Gamma = \Gamma_0 = (\gamma_{jk})_{j,k=1}^{p} = \mathbb{E}(X_i X_i^\top)$. We can estimate
$\Gamma$ by the sample covariance matrix
$\hat{\Gamma} = (\hat{\gamma}_{jk})_{j,k=1}^{p} = n^{-1} \sum_{i=1}^{n} X_i X_i^\top$. To perform
simultaneous inference on $\gamma_{jk}$, $1 \le j, k \le p$, one needs to derive the asymptotic
distribution of the maximum deviation $\max_{j,k \le p} |\hat{\gamma}_{jk} - \gamma_{jk}|$ or the
normalized version $\max_{j,k \le p} |\hat{\gamma}_{jk} - \gamma_{jk}| / \tau_{jk}$ (cf. Equation (2)
in Xiao and Wu (2013)). Jiang (2004) established the Gumbel convergence of the maximum
deviation assuming that all entries of $X_i$ are also independent. See Li and Rosalsky (2006)
and Liu et al. (2008) for some refined results. Xiao and Wu (2013) considered the extension
which allows dependence among entries of $X_i$. However the latter paper requires that the
vectors $X_1, \ldots, X_n$ are i.i.d. The problem of further extension to temporally dependent
$X_i$ is open. In analyzing electrocorticogram data in the format of multivariate time series,
Kramer et al. (2009) proposed to use the maximum cross correlation between time series to
identify edges that connect the corresponding nodes in a network, suggesting that an
asymptotic theory for maximum deviations of sample covariances is needed.

Our Theorems 3.2 and 3.3 can be applied to the above problem of further extension
to a temporally dependent process $(X_i)$. Let $(X_i)$ be a mean zero $p$-dimensional
stationary process of form (3). To apply Theorems 3.2 and 3.3, one needs to deal with the
key issue of computing the functional dependence measure of the $p^2$-dimensional vector
$\mathcal{X}_i = \mathrm{vec}(X_i X_i^\top - \mathbb{E}(X_i X_i^\top))$. Interestingly, our
framework allows a natural and elegant treatment. Let $a = (j, k)$, $j, k \le p$, and
$\mathcal{X}_{ia} = X_{ij} X_{ik} - \gamma_a$, where
$\gamma_a = \gamma_{jk} = \mathbb{E}(X_{ij} X_{ik})$.
By Hölder's inequality, the functional dependence of the component process $(\mathcal{X}_{ia})_i$

^i,ql2,a ■ '^{XijXn^) Xjj |QjXj/jqg| -|- E(XjjqQ|Xj^ |qj) || 

— II g/2 

— ^ifc,{0}) IIg/2 T 2||(Xjj ^ij,{0})^ifc,{0} ||q/2 

< 2\\Xij\\q5i^q^k + ‘^\\Xik\\q5i^qJ. (18) 

Hence, we can have an upper bound of the dependence adjusted norm of (T^a) 

OO 

||^■a||g/2,a := SUp(m -M)" fc 

m>0 —^ 

— i=m 

— 2||X.jj|q^o||-^-fe||g,a + 2||X.fc||g^o||-^7llg,a- (19) 

Consequently, the uniform and the overall dependence adjusted norms of Xi are 
max ||T’.a||g/2,a < 

a 

( \ 2/5 / p \2/q/p \ 2/q 

X ii'V-iiJ/Lj ■ (2°) 

Similarly, the dependence adjusted norm for the process {Xi) can be calculated by 


With (18)-(21), conditions in Theorems 3.2 and 3.3 can be formulated accordingly, and
under those conditions we can have the following Gaussian Approximation

$$\sup_{u \ge 0} \big| \mathbb{P}(\sqrt{n} \max_a |\hat{\gamma}_a - \gamma_a| / \tau_a \ge u) - \mathbb{P}(\max_a |Z_a / \tau_a| \ge u) \big| \to 0, \tag{22}$$

where $Z = (Z_a)_a \sim N(0, \Sigma_{\mathcal{X}})$, $\Sigma_{\mathcal{X}}$ is the
$p^2 \times p^2$ long-run covariance matrix of $(\mathcal{X}_i)_i$ and $(\tau_a^2)_a$ is the
diagonal of $\Sigma_{\mathcal{X}}$.

4 Estimation of long-run covariance matrix 

Given the realization $X_1, \ldots, X_n$, to apply the Gaussian approximation (7), we need to
estimate the long-run covariance matrix $\Sigma$. Note that $\Sigma/(2\pi)$ is the value of the
spectral density matrix of $(X_i)$ at zero frequency. In the one-dimensional case, there is a
large literature concerning spectral density estimation; see for example Anderson (1971),
Priestley (1981), Rosenblatt (1985), Brockwell and Davis (1991), Liu and Wu (2010) among others.
In the high-dimensional setting, Chen et al. (2013) studied the regularized estimation of
$\Gamma(0) = \mathbb{E}(X_0 X_0^\top)$. Assume $\mathbb{E}X_i = 0$. We then consider the batched
mean estimate

$$\hat{\Sigma} = \frac{1}{Mw} \sum_{b=1}^{w} Y_b Y_b^\top = \frac{1}{Mw} \sum_{b=1}^{w} \Big( \sum_{i \in L_b} X_i \Big) \Big( \sum_{i \in L_b} X_i \Big)^\top, \tag{23}$$

where the window $L_b = \{1 + (b-1)M, \ldots, bM\}$, $b = 1, \ldots, w$, the window size
$|L_b| = M \to \infty$ and the number of blocks $w = \lfloor n/M \rfloor$. Theorems 4.1 and 4.2
concern the convergence of the above estimate for processes with finite polynomial and finite
sub-exponential dependence adjusted norms, respectively. The convergence rate depends in a
subtle way on the temporal dependence characterized by $\alpha$ (cf. (6)), the uniform and the
overall dependence adjusted norms $\Psi_{q,\alpha}$ and $\Upsilon_{q,\alpha}$, respectively,
the sample size $n$ and the dimension $p$.
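
A minimal sketch of the batched mean estimate follows. It reads (23) as $\hat{\Sigma} = (Mw)^{-1} \sum_{b=1}^{w} Y_b Y_b^\top$ with block sums $Y_b = \sum_{i \in L_b} X_i$ over consecutive windows of length $M$; that reading, the function name and the toy data are assumptions made for illustration, and observations are taken to have mean zero as in the text.

```python
def batched_mean_estimate(xs, block_size):
    """Batched-mean estimate of the long-run covariance matrix.

    `xs` is a list of n observations, each a list of p floats with mean zero
    (as assumed in the text); trailing observations beyond w * M are dropped.
    """
    n, p = len(xs), len(xs[0])
    w = n // block_size  # number of blocks, floor(n / M)
    if w == 0:
        raise ValueError("block_size must not exceed the sample size")
    sigma = [[0.0] * p for _ in range(p)]
    for b in range(w):
        block = xs[b * block_size:(b + 1) * block_size]
        # Block sum Y_b = sum of X_i over the b-th window L_b.
        y = [sum(row[j] for row in block) for j in range(p)]
        for j in range(p):
            for k in range(p):
                sigma[j][k] += y[j] * y[k]
    scale = 1.0 / (block_size * w)
    return [[scale * value for value in row] for row in sigma]


# Toy data with p = 1: block sums are 2 and -2, so the estimate is (4 + 4) / (2 * 2) = 2.
toy = [[1.0], [1.0], [-1.0], [-1.0]]
estimate = batched_mean_estimate(toy, block_size=2)
```

The choice of `block_size` trades bias against variance: larger blocks capture more lags of autocovariance but leave fewer blocks to average over.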

For a random variable $X$, we define the operator $\mathbb{E}_0$ as $\mathbb{E}_0(X) := X - \mathbb{E}X$.

Theorem 4.1. Assume '^q^a < C )0 with q > 4 and a > 0, and M = 0{n‘") for some 
0 < <^ < 1. Let Fa = wM (resp. or i(;'?/4-a5/2j\^g/2-a(}/2j Q, ^ 1 — 2/g (resp. 

1/2 — 2/q < a < 1 — 2/q or a < 1/2 — 2/q). Then for x > we have 

F / C r 

nn\d^ag{J:)-Ed^ag{J:)U > x) < +pexp 
^ A VF

(n|S -ES|oo > 2:) < +p^exp 


X' 


Cq^gX^ 


( 24 ) 


for all large n, where the constants in < only depend on g, a and q. 


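Proof. Fix $1 \le j, k \le p$; let $T = \sum_{b=1}^{w} Y_{bj} Y_{bk}$, where
$Y_{bj} = \sum_{i \in L_b} X_{ij}$. For $\tau \ge 0$, define
$X_{ij,\tau} = \mathbb{E}(X_{ij} \mid \varepsilon_{i-\tau}, \ldots, \varepsilon_i)$,
$Y_{bj,\tau} = \sum_{i \in L_b} X_{ij,\tau}$ and
$T_\tau = \sum_{b=1}^{w} Y_{bj,\tau} Y_{bk,\tau}$. We will first prove
for any $x > 0$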


P(|Eo(T-Tm)|>x)< 


x-Y'^wMY'^--Y\il + Eq^g{x), « > 1/2 - 2/g 

^ Eq^g{x), a<lf2-2/q'^ 


where the constants in < only depend on <^, a and q, and 


^q,a II W.j llg g 11^.^ Ilg Q, “1- || || g g || || g q,, 

Eq,a{x) = exp{-Cg,„(wM^“^“C 4 ,a)“^a;^}- 

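Following the argument in the proof of Lemma 5.7, let $L = \lfloor (\log w)/(\log 2) \rfloor$,
$\varpi_l = 2^l$, $1 \le l < L$, $\varpi_L = w$ and $\tau_l = M \varpi_l$ for $1 \le l \le L$.
Let $\varpi_0 = 1$ and $\tau_0 = M$. Write

$$T - T_M = T - T_{Mw} + \sum_{l=1}^{L} V_{w,l}, \quad \text{where } V_{w,l} = T_{\tau_l} - T_{\tau_{l-1}}. \tag{26}$$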

By the argument in Lemma 9 of Xiao and Wu (2012), we have 

||Eg(F FMy,)||g^2 Y CqAdy/w(^X() q jX]\Xw+l,q,k Y ^M'W+l,qJ^0,q,k') 

< CqMV^{Mw)-%,g (27) 


for some constant 4„ > 0. By Markov’s inequality, for a; > 0, 


g(T — Tmw)\ P x) < 


4,M''/2-«a/2y;a/4-ag/2^9/2 


X' 


q/2 


(28) 


By the same argument for proving (27), we have 


||Eg(K,,;)||,/2 < CqMy/fdr^^iq^, 
Let $c = q/4 - 1 - \alpha q/2$, $\lambda_l = 3\pi^{-2} l^{-2}$ if $1 \le l \le L/2$ and
$\lambda_l = 3\pi^{-2} (L + 1 - l)^{-2}$ if $L/2 < l \le L$. Then
$\sum_{l=1}^{L} \lambda_l \le 1$. By the Nagaev (1979) inequality, it follows that

L L 

P(| 5 ^Eo(K,, 0 I >^) < 5 ^P(|Eo(K,,z)| > A;x) 


i=i 


1=1 

L 

1=1 


^ C 2 (Azx)V; 2 “ 


< 


(AiX)9/2 

C'3W;M'?/2-ag/2^9/2 L 




exp 


1=1 


wM^a,a 


Xql2 


7772 + *^4 ^g,a(A 


1=1 \ 


iwfx) 


1=1 


Elementary calculations show that 

L 


< C 5 for c < 0 and < CqZuI = Cqw'^ for c > 0 . 


1=1 A; 

Furthermore, we can use (57) to obtain 


1=1 A; 


(29) 


(30) 


y^Eq^a(A;Z^fx) < Exa{x). 


1=1 


(31) 


Putting (26), (28), (29), (30) and (31) together, we then have (25). 

Now it suffices to consider $\mathbb{P}(|\mathbb{E}_0(T_M)| \ge x)$. Observe that
$(Y_{bj,M} Y_{bk,M})_{b \text{ is odd}}$ are independent and so are
$(Y_{bj,M} Y_{bk,M})_{b \text{ is even}}$. By Corollary 1.7 of Nagaev (1979), for any $J > 1$,

'E”.i IIEo(nj.MW)ii«/^' ’ 

^ bj,M J-bk,M )\ ^ -^z w ^ I - 

b=l \ 


'o(Tm)\ > x) < ''^^f‘{\^oiYbj^MYbk,M)\ — ^/(2'^)) +2 


Jx5/2 


-|-4exp 


Note that llg < CgVM\\X. 


CnX"^ 


IIEo(n,,Arnfc,M) 


i II 9,0- 


Hence for 1 <h < w, 1 < j,k < p and O' > 4, 


||Eo(Tfej^MLfefc,in)||g/2 < 2||F6j^ML6fc,M||g/2 < 2||Fbj^M||g||PbA:,M||g < C'gMH Xj || || A7.fc || q,0- 


Since 


^\Ybj,MYbk,M\ < ||LfeyM|| 2 ||Lbfc,M ||2 < A7||X.j || 2 , 01 | A7.fc || 2,0 < 


X 
we have


W 

P(|Eo(Tm)| > x) < ^P(|hfej,Mh6fc,M| > a^/(4J)) 

b=l 

\ Jx9/2 


J 

+ 4 exp 


( C,x^ \ 

V wM^^i J ■ 


Recall that M = Oin"^) with 0 < <^ < 1. Let J = 1 + (2g — 2){q — 4)“^(1 — ^)“L Since 
X > \/u;M||Xj||q,o||-^ A:||5,05 elementary calculations show that for sufficiently large n the 
second term in the above expression is no greater than C'jtcM||Xj||q^Q^||X.fc||g^Q^/x'^/^. As 
for the first term, we have 


n\yb,,MYbk,M\ > ^/(4^)) < P(|n,,M| > ^/x|{AJ))+¥{\Y,k,M\ > vWM)- 


By Lemma 5.2, for a>l/2 — 1/g and a<l/2 — 1/g, respectively, we have 

Cq^aX '^'^^M||X.j||^ ,3, + Cq^a GXp J4\\X~\\^^ 
Cq,^x-y^My^-^^X.q\\l^ + C',,„exp ' 


P(|Lfei,Ar| > Vx) < 


A similar inequality holds for F{\Yi,k,M I > ^/x). Let (pq^a = \\X.j\\l^ + ||X.fc||5^. 
follows that fora>l /2 — 1/g and q;<1/2 — 1/g respectively, 

Cq^^x-y^wMcpq^^ + Cg,c.exp (- 


o{Tm)\ ^ x) < 




Cg^aX + C'g,« SXp 

n /9 

Combining (25) and (32), and noticing that Q'a < CqCpq^a, it follows that 

^-‘0{T)\ >X)< Cq^aX~y‘^Fa(pq,a + CXp 


Cq,aX'^ 


Cq^gX"^ 

which implies (24) by the Bonferroni inequality by summing over j and k. 

Under stronger moment conditions, we can have an exponential inequality. 
Theorem 4.2. Assume < 00 for some u > 0. Then for all x > 0, we have 


{n\diag(T,) — Kdiag(T,)\oo > x) < pexp 


X ’ 




(n|E - EE|oo > a:) < exp 


X ' 


where $\gamma = 1/(1 + 2\nu)$ and the constants in $\lesssim$ only depend on $\nu$.


Hence, it 


(32) 


□ 


(33) 

(34) 
Proof. Let $T = \sum_{b=1}^{w} Y_{bj} Y_{bk}$. By the Burkholder inequality, we have


wM wM / w 

\\^nli 2 < (9/2 - 1 ) 5 ; \\v‘T\\%^ < (9/2 -1)5] 5; l|■p‘n,n,||,/, I ( 35 ) 

/=—00 l=—oo \6=1 / 

By Theorem 3 in Wu (2011), \\Ybj\\g < {q - lf/‘^y/M\\X.j\\q^Q. Since \\Y^k - ^6fc,{z}||g < 

E hM r 1 

we have 


II g/2 


6=1 


b=i 


< + llUj - W 


6=1 


/ wM wM 

< {q — 1)B2a/M I \\X.j\\qfi'^2^h-l,q,k + |||| g ,0 ^ j ) , 

h=l h=l 




which by (35) implies that 


wM 


IIEoTIIJ/j < ( 9/2 - 1) 5 ; \\V‘T\\li, <(q- 2)(q - 1)u’M2||X,||1„||A'.i| 


2 

<?,o- 


(36) 


Let Rjk = EqT/ {\/wM). Similarly as the argument for proving Lemma 5.3, if 7 /z > 2, it fol¬ 
lows that ||i?jfc|l 7 /i < (27^-l)(27^)^''ll^-j||p..o||^-fclU.,o- Let To = ( 2 e 7 ||X.j||;^^_o||Xfc||^^ o)-L 
Notice that —2z/ = 1 — 1 / 7 . Then 

^ t'-(27A - l)"'‘(27A)^-^'‘||X,||i;„||Xt||;7 

h\ ~ Ciih/e)^a'^^ 

^ aht’^{2^h - ^ aht'^ 

- C,T^{2^h)^^ - Cr^T^- 

If 'jh < 2 , then \\Rjk\Uh < WRjkh < a /6 ■ 42 ''||Xj||,/,^,o||-^-A:|l 7 ,,o- So we have 


E[exp(f/2JJ] <1+5^ 

1<Zi<2/7 


C(v'6.4=7|X,||<..,„||Xt||*,.„)’'‘ 


ft! 


E 

h>2l'y 


tthP 


CiA/er/ 


fh j. 

< 1 + C.y'^Oh^ < I + C^jy— 


h=l 


'{l-t/roYR- 


By choosing t = ro/2, and applying the Markov inequality and the Bonferroni inequality, 
(33) and (34) are obtained. □ 
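Remark 2. An alternative estimate of $\Sigma$, which also works with unknown mean
$\mathbb{E}X_i$, is

$$\tilde{\Sigma} = \frac{1}{Mw} \sum_{b=1}^{w} \Big( \sum_{i \in L_b} (X_i - \bar{X}) \Big) \Big( \sum_{i \in L_b} (X_i - \bar{X}) \Big)^\top, \tag{37}$$

where $\bar{X} = (wM)^{-1} \sum_{i=1}^{wM} X_i$ and $w = \lfloor n/M \rfloor$. Then
$|\tilde{\Sigma} - \hat{\Sigma}|_\infty = M |\bar{X}|_\infty^2$. Applying Lemma 5.2 to
$\bar{X}$, we can conclude that Theorems 4.1 and 4.2 still hold for $\tilde{\Sigma}$ with
$\mathbb{E}\hat{\Sigma}$ therein replaced by
$\Sigma_M := M^{-1}\, \mathrm{Cov}(\sum_{i \in L_1} X_i)$ (which equals
$\mathbb{E}\hat{\Sigma}$ if $\mathbb{E}X_i = 0$).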
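
The identity relating the two estimates can be verified directly. The derivation below is a sketch under the assumption that the uncentred estimate is $\hat{\Sigma} = (Mw)^{-1} \sum_{b} Y_b Y_b^\top$ and the centred one is $\tilde{\Sigma} = (Mw)^{-1} \sum_{b} (Y_b - M\bar{X})(Y_b - M\bar{X})^\top$, with block sums $Y_b = \sum_{i \in L_b} X_i$ and $\bar{X} = (wM)^{-1} \sum_{i=1}^{wM} X_i$, so that $\sum_{b=1}^{w} Y_b = wM\bar{X}$.

```latex
\begin{aligned}
\hat{\Sigma} - \tilde{\Sigma}
&= \frac{1}{Mw} \sum_{b=1}^{w} \Big[ Y_b Y_b^\top - (Y_b - M\bar{X})(Y_b - M\bar{X})^\top \Big] \\
&= \frac{1}{Mw} \Big[ M \Big( \sum_{b=1}^{w} Y_b \Big) \bar{X}^\top
   + M \bar{X} \Big( \sum_{b=1}^{w} Y_b \Big)^\top - w M^2 \bar{X} \bar{X}^\top \Big] \\
&= \frac{1}{Mw} \big( 2 w M^2 - w M^2 \big) \bar{X} \bar{X}^\top
 = M \bar{X} \bar{X}^\top .
\end{aligned}
```

Since the largest entry of $\bar{X}\bar{X}^\top$ in absolute value is $(\max_j |\bar{X}_j|)^2$, taking the entrywise maximum gives $|\tilde{\Sigma} - \hat{\Sigma}|_\infty = M |\bar{X}|_\infty^2$.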

Corollary 4.3. (i) Under conditions in Theorem 4-1, we have |E — E|oo = Op(r„), where 



r^ = n ^ max 


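where $v(M) = 1/M$ if $\alpha > 1$, $v(M) = \log M / M$ if $\alpha = 1$ and
$v(M) = 1/M^{\alpha}$ if $0 < \alpha < 1$. (ii) Under conditions in Theorem 4.2, we have
$|\hat{\Sigma} - \Sigma|_\infty = O_{\mathbb{P}}(r_n)$ with
$r_n = n^{-1} \sqrt{w}\, M \Phi_{\psi_\nu, 0}^{2} (\log p)^{1/\gamma} + \Psi_{2,0} \Psi_{2,\alpha} v(M)$.

The above Corollary easily follows from Theorems 4.1 and 4.2 since the bias
$|\Sigma_M - \Sigma|_\infty \lesssim \Psi_{2,0} \Psi_{2,\alpha} v(M)$; see the proof of Lemma 6.3.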

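For the estimate $\tilde{\Sigma}$ in (37), let $\hat{D}_0 = [\mathrm{diag}(\tilde{\Sigma})]^{1/2}$.
Let $\hat{Z} = \tilde{\Sigma}^{1/2} \eta$, where $\eta \sim N(0, \mathrm{Id}_p)$ is independent
of $(X_i)_i$. Then conditioning on $(X_i)_i$, $\hat{Z} \sim N(0, \tilde{\Sigma})$. Let
$0 < \theta < 1$; let $x_\theta$ be the conditional $\theta$-quantile of
$|\hat{D}_0^{-1} \hat{Z}|_\infty$ given $(X_i)_i$. We can use $x_\theta$ to estimate the
$\theta$-quantile of $\sqrt{n}\, |D_0^{-1} (\bar{X}_n - \mu)|_\infty$, thus constructing
simultaneous confidence intervals for $\mu = (\mu_1, \ldots, \mu_p)^\top$ as
$\bar{X}_{nj} \pm x_\theta \tilde{\sigma}_{jj}^{1/2} n^{-1/2}$, $1 \le j \le p$, where
$\tilde{\sigma}_{jj}$ is the $j$-th diagonal entry of $\tilde{\Sigma}$. Assume that
$r_n = o(1/\log^2 p)$. Then $\pi(|\tilde{\Sigma} - \Sigma|_\infty) = o(1)$, and by Lemma 3.1 in
Chernozhukov et al. (2013a), the latter simultaneous confidence intervals have the
asymptotically correct coverage probability $\theta$. Note that $x_\theta$ can be obtained by
sample quantile estimates from extensive simulations of $\hat{Z} = \tilde{\Sigma}^{1/2} \eta$.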
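
The quantile simulation described above can be sketched in a few lines. To avoid a matrix square root, the sketch below does not simulate $\tilde{\Sigma}^{1/2}\eta$ directly; it swaps in block multipliers, drawing $Z^* = (wM)^{-1/2} \sum_{b} g_b (Y_b - M\bar{X})$ with i.i.d. $N(0,1)$ weights $g_b$, whose conditional covariance is the centred batched-mean estimate. That substitution, the function name, and the MA(1) test data are assumptions made for illustration and are not claimed to be the authors' procedure.

```python
import math
import random


def simultaneous_intervals(xs, block_size, theta, n_boot, rng):
    """Simultaneous confidence intervals for the mean via Gaussian block multipliers.

    Draws Z* = (wM)^{-1/2} * sum_b g_b (Y_b - M * Xbar) with i.i.d. N(0, 1)
    multipliers g_b; conditionally on the data its covariance is the centred
    batched-mean estimate of the long-run covariance matrix.
    """
    p = len(xs[0])
    w = len(xs) // block_size
    used = xs[:w * block_size]
    xbar = [sum(row[j] for row in used) / len(used) for j in range(p)]
    # Centred block sums Y_b - M * Xbar.
    blocks = []
    for b in range(w):
        chunk = used[b * block_size:(b + 1) * block_size]
        blocks.append([sum(row[j] - xbar[j] for row in chunk) for j in range(p)])
    # Diagonal of the centred batched-mean estimate (estimated long-run variances).
    lrv = [sum(y[j] ** 2 for y in blocks) / (w * block_size) for j in range(p)]
    sd = [math.sqrt(v) for v in lrv]
    maxima = []
    for _ in range(n_boot):
        g = [rng.gauss(0.0, 1.0) for _ in range(w)]
        z = [sum(g[b] * blocks[b][j] for b in range(w)) / math.sqrt(w * block_size)
             for j in range(p)]
        maxima.append(max(abs(z[j]) / sd[j] for j in range(p)))
    maxima.sort()
    # Empirical theta-quantile of the normalised maximum.
    x_theta = maxima[min(n_boot - 1, int(math.ceil(theta * n_boot)) - 1)]
    half = [x_theta * sd[j] / math.sqrt(len(used)) for j in range(p)]
    return [(xbar[j] - half[j], xbar[j] + half[j]) for j in range(p)], x_theta


rng = random.Random(1)
# Hypothetical data: 3 coordinates driven by MA(1) noise with true mean zero.
eps = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(601)]
data = [[eps[i + 1][j] + 0.5 * eps[i][j] for j in range(3)] for i in range(600)]
intervals, x_theta = simultaneous_intervals(data, block_size=20, theta=0.95,
                                            n_boot=500, rng=rng)
```

All $p$ intervals share the single critical value `x_theta`, which is what makes the coverage statement simultaneous rather than coordinate-wise.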

5 Tail probability inequalities under dependence 

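Tail probability inequalities play an important role in simultaneous inference. Here we
shall provide some Nagaev-type tail probability inequalities. They are of independent
interest. Let $\varepsilon_i, \varepsilon_j'$, $i, j \in \mathbb{Z}$, be i.i.d. random variables.
We start with the one-dimensional stationary process $(e_i)_{i=-\infty}^{\infty}$ of form

$$e_i = g(\ldots, \varepsilon_{i-1}, \varepsilon_i), \tag{38}$$

where $g$ is a measurable function such that $e_i$ is well-defined. Recall
$\mathcal{F}_i^j = (\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_j)$,
$\mathcal{F}^i = (\ldots, \varepsilon_{i-1}, \varepsilon_i)$ and
$\mathcal{F}_i = (\varepsilon_i, \varepsilon_{i+1}, \ldots)$. Let the projection operators
$\mathcal{P}^j \cdot = \mathbb{E}(\cdot \mid \mathcal{F}^j) - \mathbb{E}(\cdot \mid \mathcal{F}^{j-1})$ and
$\mathcal{P}_j \cdot = \mathbb{E}(\cdot \mid \mathcal{F}_j) - \mathbb{E}(\cdot \mid \mathcal{F}_{j+1})$.
As in (5), define respectively the functional and the predictive dependence measures

$$\delta_{i,q} = \|e_i - e_{i,\{0\}}\|_q, \quad \theta_{i,q} = \|\mathcal{P}^0 e_i\|_q \quad \text{and} \quad \theta'_{i,q} = \|\mathcal{P}_0 e_i\|_q,$$

where $e_{i,\{0\}} = g(\mathcal{F}^{i,\{0\}})$ and
$\mathcal{F}^{i,\{0\}} = (\ldots, \varepsilon_{-1}, \varepsilon_0', \varepsilon_1, \ldots, \varepsilon_i)$.
Let $\delta_{i,q} = 0$ if $i < 0$; let $\Delta_{m,q} = \sum_{i=m}^{\infty} \delta_{i,q}$,
$m \ge 0$, be the tail dependence measures, and the dependence adjusted norm

$$\|e_\cdot\|_{q,\alpha} := \sup_{m \ge 0} (m+1)^{\alpha} \Delta_{m,q}, \quad \text{for } \alpha \ge 0. \tag{39}$$

Here $\delta_{i,q}$ measures the dependence of $e_i$ on $\varepsilon_0$ and $\Delta_{m,q}$
measures the cumulative impact of $\varepsilon_0$ on $(e_i)_{i \ge m}$. The projections
$(\mathcal{P}^i \cdot)_{i \in \mathbb{Z}}$ and $(\mathcal{P}_i \cdot)_{i \in \mathbb{Z}}$ induce
martingale differences with respect to $(\mathcal{F}^i)$ and $(\mathcal{F}_{-i})$, respectively.
Both predictive dependence measures evaluate the effect on the prediction of $e_i$ when part of
the previous inputs is concealed, and they satisfy $\theta_{i,q} \le \delta_{i,q}$ and
$\theta'_{i,q} \le \delta_{i,q}$ in view of Jensen's inequality.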

5.1 Inequalities with Finite Polynomial Moments 

For $m \ge 0$, the $m$-dependence approximation of $e_i$ is denoted by $e_{i,m}$, where

$$e_{i,m} = \mathbb{E}(e_i \mid \varepsilon_{i-m}, \ldots, \varepsilon_i).$$

Let $S_n = \sum_{i=1}^{n} e_i$ and $S_{n,m} = \sum_{i=1}^{n} e_{i,m}$. With the dependence
adjusted norm (39), we are able to provide tail probability inequalities for error bounds when
approximating $(e_i)$ by the $m$-dependent process $(e_{i,m})$. In lemmas below the constant
$C_{q,\alpha}$ only depends on $q$ and $\alpha$ and its values may change from line to line.
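
For a concrete feel for the $m$-dependence approximation, consider the special case of a causal linear process $e_i = \sum_{k \ge 0} a_k \varepsilon_{i-k}$ with mean-zero i.i.d. inputs. Conditioning on $(\varepsilon_{i-m}, \ldots, \varepsilon_i)$ then simply drops the lags beyond $m$. The sketch below illustrates this; the coefficients, the finite number of lags and the function names are hypothetical choices.

```python
import random


def linear_process(coeffs, eps, i):
    """e_i = sum_k coeffs[k] * eps[i - k] for a causal linear process."""
    return sum(a * eps[i - k] for k, a in enumerate(coeffs))


def m_dependent_version(coeffs, eps, i, m):
    """E(e_i | eps_{i-m}, ..., eps_i): for mean-zero i.i.d. inputs, lags beyond m drop out."""
    return sum(a * eps[i - k] for k, a in enumerate(coeffs[:m + 1]))


rng = random.Random(2)
coeffs = [0.6 ** k for k in range(30)]      # hypothetical coefficients a_k
offset = len(coeffs)                        # burn-in so that eps[i - k] always exists
eps = [rng.gauss(0.0, 1.0) for _ in range(offset + 500)]
n, m = 500, 5
s_n = sum(linear_process(coeffs, eps, offset + i) for i in range(n))
s_nm = sum(m_dependent_version(coeffs, eps, offset + i, m) for i in range(n))
# The approximation error S_n - S_{n,m} only involves coefficients with lag > m.
error = s_n - s_nm
tail_weight = sum(abs(a) for a in coeffs[m + 1:])
```

In this linear case the functional dependence measure at lag $i$ is proportional to $|a_i|$, so `tail_weight` plays the role of $\Delta_{m+1,q}$ up to a constant factor.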


Lemma 5.1. Assume ||e.||q_a < oo, where q > 2 and a > 0. (i) If a > 1/2 — 1/q, then 

(40) 


C II'? / n ^2 2a 

p(|5. - ^.,^1 > x) < ---Ukh + Cq,^ exp ^ 




’^lle-llgc 


holds for all X > 0 and 1 < m < n. (ii) If 0 < a < 1/2 — 1/q, we have 


>(r) < 




q,OL 


Xl 


+ Cq^a exp 


Cq^aX‘^na?°‘ 


^l|e.|lv 


(41) 
Proof of Lemma 5.1. It is a special case of Lemma 5.7 for p = 1. □

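Lemma 5.2 (cf. Theorem 2 of Wu and Wu (2015)). Assume that $\|e_\cdot\|_{q,\alpha} < \infty$,
where $q > 2$ and $\alpha > 0$. (i) If $\alpha > 1/2 - 1/q$, then there exists some constant
$C_{q,\alpha}$ depending on $q$ and $\alpha$ only such that, for $x > 0$,

$$\mathbb{P}(|S_n| \ge x) \le \frac{C_{q,\alpha}\, n \|e_\cdot\|_{q,\alpha}^{q}}{x^q} + C_{q,\alpha} \exp\Big( - \frac{C_{q,\alpha} x^2}{n \|e_\cdot\|_{2,\alpha}^{2}} \Big). \tag{42}$$

(ii) If $0 < \alpha < 1/2 - 1/q$, we have the following inequality,

$$\mathbb{P}(|S_n| \ge x) \le \frac{C_{q,\alpha}\, n^{q/2 - \alpha q} \|e_\cdot\|_{q,\alpha}^{q}}{x^q} + C_{q,\alpha} \exp\Big( - \frac{C_{q,\alpha} x^2}{n \|e_\cdot\|_{2,\alpha}^{2}} \Big). \tag{43}$$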

Remark 3. By Markov’s inequality and Lemma 1 of Liu and Wu (2010), one obtains 

\\Sn — Snm\\n || 6 . || „ 

n\Sn - Sn,m\ > x) < " < Cq -(44) 

In comparison, the polynomial tail bounds in (42) and (43) are sharper. 


5.2 Inequalities with Finite Exponential Moments 

If $e_i$ satisfies a stronger moment condition than the existence of a finite $q$-th moment, we
can have an exponential inequality. We shall assume $\|e_\cdot\|_{q,\alpha} < \infty$ for all
$q > 0$ and some $\alpha \ge 0$ and we further assume, for some $\nu > 0$, that the dependence
adjusted sub-exponential norm satisfies

$$\|e_\cdot\|_{\psi_\nu, \alpha} := \sup_{q \ge 2} q^{-\nu} \|e_\cdot\|_{q,\alpha} < \infty. \tag{45}$$

By this definition, if $e_i$ are i.i.d., $\|e_\cdot\|_{\psi_\nu, \alpha}$ reduces to the
sub-Gaussian norm ($\nu = 1/2$) or sub-exponential norm ($\nu = 1$) of the random variable
$e_i$, by the equivalence of $\|e_\cdot\|_{q,\alpha}$ and $\|e_i\|_q$. The parameter $\nu$
measures how fast $\|e_\cdot\|_{q,\alpha}$ increases with $q$.
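
The role of $\nu$ can be illustrated with a standard Gaussian variable $Z$, for which $\mathbb{E}|Z|^q = 2^{q/2}\,\Gamma((q+1)/2)/\sqrt{\pi}$ and hence $\|Z\|_q$ grows like $\sqrt{q}$. The sketch below (function names hypothetical) evaluates $q^{-\nu}\|Z\|_q$ on a grid of $q$: the values stay bounded for $\nu = 1/2$ and keep growing for a smaller $\nu$, so the supremum in a norm of the form $\sup_{q \ge 2} q^{-\nu}\|Z\|_q$ is finite exactly when $\nu \ge 1/2$.

```python
import math


def gaussian_lq_norm(q):
    """||Z||_q for Z ~ N(0, 1), using E|Z|^q = 2^{q/2} Gamma((q + 1) / 2) / sqrt(pi)."""
    log_moment = (0.5 * q * math.log(2.0) + math.lgamma(0.5 * (q + 1.0))
                  - 0.5 * math.log(math.pi))
    return math.exp(log_moment / q)


def scaled_norms(nu, qs):
    """q^{-nu} ||Z||_q over a grid of q values."""
    return [q ** (-nu) * gaussian_lq_norm(q) for q in qs]


grid = [2, 4, 8, 16, 32, 64, 128, 256]
bounded = scaled_norms(0.5, grid)    # stays bounded: ||Z||_q grows like sqrt(q)
growing = scaled_norms(0.25, grid)   # keeps increasing: nu = 1/4 is too small
```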


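Lemma 5.3. Assume (45). Let $J_n = (S_n - S_{n,m})/\sqrt{n}$ and $\beta = 2/(1 + 2\nu)$. Then

$$h(t) := \sup_{n \in \mathbb{N}} \mathbb{E}[\exp(t |J_n|^{\beta})] \le 1 + C_\beta (1 - t/t_0)^{-1/2}\, t/t_0$$

holds for $0 \le t < t_0$ with
$t_0 = m^{\alpha\beta} / (e \beta \|e_\cdot\|_{\psi_\nu, \alpha}^{\beta})$. Consequently,
letting $t = t_0/2$, for $x > 0$,

$$\mathbb{P}(|J_n| \ge x) \le \exp(-t x^{\beta}) h(t) \le C_\beta \exp\Big( - \frac{x^{\beta} m^{\alpha\beta}}{2 e \beta \|e_\cdot\|_{\psi_\nu, \alpha}^{\beta}} \Big). \tag{46}$$

Lemma 5.4 (cf. Theorem 3 of Wu and Wu (2015)). Assume (45) holds for $\alpha = 0$. Let
$\beta = 2/(1 + 2\nu)$. Then for $x > 0$,

$$\mathbb{P}(|S_n/\sqrt{n}| \ge x) \le C_\beta \exp\Big( - \frac{x^{\beta}}{2 e \beta \|e_\cdot\|_{\psi_\nu, 0}^{\beta}} \Big). \tag{47}$$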

Proof of Lemma 5.3. Let Qn^i = ^ — 0- Then is a backward martingale. 

By Burkholder’s inequality, we have 

n 

IIQ„,.II? < (9 - 1) ^ ll-Pi-.VIIJ = {q- l)n(9;,,)^ 

i=l 

By 9[q < 6i^q, we have ||Jn||q < {q - in view of y/nJn = Write 

the negative binomial expansion $(1 - s)^{-1/2} = 1 + \sum_{k=1}^{\infty} a_k s^k$ with
$a_k = (2k)! / (2^{2k} (k!)^2)$ for $|s| < 1$. By Stirling's formula, we have
$a_k \sim (k\pi)^{-1/2}$ as $k \to \infty$. Hence, there exist absolute constants
$C_1, C_2 > 0$ such that for all $k \ge 1$,

$$C_1 (k/e)^k a_k^{-1} \le k! \le C_2 (k/e)^k a_k^{-1}. \tag{48}$$
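
The two facts about the coefficients $a_k = (2k)!/(2^{2k}(k!)^2)$ used here, namely that they are the coefficients of $(1-s)^{-1/2}$ and that $a_k \sqrt{\pi k} \to 1$, are easy to confirm numerically. The sketch below is only a sanity check; the function name and the test point $s = 0.3$ are arbitrary choices.

```python
import math


def neg_binomial_coeff(k):
    """a_k = (2k)! / (2^{2k} (k!)^2), the coefficient of s^k in (1 - s)^{-1/2}."""
    return math.comb(2 * k, k) / 4 ** k


# Partial sum of the series at s = 0.3 against the closed form (1 - s)^{-1/2}.
s = 0.3
partial = 1.0 + sum(neg_binomial_coeff(k) * s ** k for k in range(1, 200))
closed_form = (1.0 - s) ** -0.5

# Stirling-type check: a_k * sqrt(pi * k) should approach 1 as k grows.
ratios = [neg_binomial_coeff(k) * math.sqrt(math.pi * k) for k in (1, 10, 100, 1000)]
```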


Under condition (45), if k[3 > 2, then ||e.||^fc^a A \\^-\\i>^,a{.l3kY and hence 

wawt ^ a.tHPk - ^ a,tt 

k\ ~ Cx{k/eYafY ~ Cit(^{(3kY^P ~ Ciy/et^ 

If kft < 2, then \\Jn\\pk < ||Tni |2 < 2‘'m“"||e.||^^,„. In l^t y 

hit) < 

< 

where Cy > t) only depends on (3. So (46) follows by Markov’s inequality. 


1+ E 

l<fc<2//3 




k\ 


E 

k>2ll3 


akt 


Ciy/etl 


oo 


1 < 1 + ^374-^ 


t/to 


k=l 


(1-A/Ao)B2’ 


tjh, then 


□ 


5.3 Inequalities for High-dimensional Time Series with Finite 
Polynomial Moments 

In this section we shall derive powerful tail probability inequalities for high-dimensional
stationary vectors; cf. Lemmas 5.7 and 5.8. The proofs require Theorem 4.1 of Pinelis
(1994), a deep Rosenthal-Burkholder type bound on moments of Banach-space-valued
martingales. Lemma 5.5 follows from Theorem 4.1 of Pinelis (1994). Lemma 5.6 is a Fuk-Nagaev
type inequality for the sum of independent random vectors. For a $p$-dimensional vector
$v = (v_1, \ldots, v_p)^\top$ recall the $s$-length $|v|_s = (\sum_{j=1}^{p} |v_j|^s)^{1/s}$, $s \ge 1$.
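
The $s$-length is a convenient smooth surrogate for the maximum norm because $|v|_\infty \le |v|_s \le p^{1/s} |v|_\infty$, and with the choice $s = 1 \vee \log p$ made in the proofs the factor $p^{1/s}$ is at most $e$. The sketch below checks this on a random vector; the dimension and the function names are arbitrary illustrative choices.

```python
import math
import random


def s_length(v, s):
    """|v|_s = (sum_j |v_j|^s)^{1/s} for s >= 1."""
    return sum(abs(x) ** s for x in v) ** (1.0 / s)


def max_norm(v):
    """|v|_inf = max_j |v_j|."""
    return max(abs(x) for x in v)


rng = random.Random(3)
p = 1000
v = [rng.gauss(0.0, 1.0) for _ in range(p)]
s = max(1.0, math.log(p))            # the choice s = 1 v log p used in the proofs
lower, middle = max_norm(v), s_length(v, s)
upper = p ** (1.0 / s) * lower       # p^{1/s} = e when s = log p
```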

Lemma 5.5. Let $D_i$, $1 \le i \le n$, be $p$-dimensional martingale difference vectors with
respect to the $\sigma$-field $\mathcal{G}_i$. Let $s > 1$ and $q \ge 2$. Then

lllDl -h . . . -h DnlsWq < C < g|| sup |AM|q + \/g(s “ 1) 

I * 

where c is an absolute constant. 

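Lemma 5.6. Assume $s > 1$. Let $X_1, \ldots, X_n$ be $p$-dimensional independent random vectors
with mean zero such that for some $q > 2$, $\| |X_i|_s \|_q < \infty$, $1 \le i \le n$. Let
$T_n = \sum_{i=1}^{n} X_i$ and $\sigma_i = (\|X_{i1}\|_2, \ldots, \|X_{ip}\|_2)^\top$. Then for
any $y > 0$,

$$\mathbb{P}(|T_n|_s \ge 2\mathbb{E}|T_n|_s + y) \le C_q y^{-q} \sum_{i=1}^{n} \mathbb{E}|X_i|_s^q + \exp\Big( - \frac{y^2}{3 \sum_{i=1}^{n} |\sigma_i|_s^2} \Big), \tag{49}$$

where $C_q$ is a positive constant only depending on $q$.

Proof of Lemma 5.6. For $s > 1$, we apply Theorem 3.1 of Einmahl and Li (2008) with the
Banach space $(\mathbb{R}^p, |\cdot|_s)$ and $\eta = \delta = 1$. The unit ball of the dual of
$(\mathbb{R}^p, |\cdot|_s)$ is the set of linear functions
$\{u = (u_1, \ldots, u_p)^\top \mapsto \lambda^\top u : \lambda \in \mathbb{R}^p, |\lambda|_a \le 1\}$
where $1/a + 1/s = 1$. By Minkowski's and Hölder's inequalities, we have

$$\|\lambda^\top X_i\|_2 \le \sum_{j=1}^{p} |\lambda_j| \|X_{ij}\|_2 \le |\lambda|_a |\sigma_i|_s.$$

Hence, the $\Lambda_n$ therein is bounded by $\sum_{i=1}^{n} |\sigma_i|_s^2$.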

Let $X_i$ be a mean zero $p$-dimensional stationary process, and
$T_n = \sum_{i=1}^{n} X_i$, $T_{n,m} = \sum_{i=1}^{n} X_{i,m}$, where
$X_{i,m} = \mathbb{E}(X_i \mid \varepsilon_{i-m}, \ldots, \varepsilon_i)$. We are interested in
bounding the tail probabilities $\mathbb{P}(|T_n - T_{n,m}|_\infty \ge x)$ and
$\mathbb{P}(|T_n|_\infty \ge x)$ for large $x$. Write $\ell = \ell(p) = 1 \vee \log p$.

Lemma 5.7. Assume $\| |X_\cdot|_\infty \|_{q,\alpha} < \infty$, where $q > 2$ and $\alpha \ge 0$.
Also assume $\Psi_{2,\alpha} < \infty$.
(i) If a> 1/2 - l/q, then for x > 


P(|T„ - Tn,m\oo >X)< 






',Q; I 

-h Cq^a exp 


Cq^aX‘^m‘^°‘ 


n^o 


2.a 


holds for all 1 <m < n. (ii) If 0 < a < 1/2 — l/q, the inequality is 


P(|T„ - T„,m|oo > x) < 




x'^ 


+ Cq^a exp 


■^q,cx^ 

n'iL 


(50) 


(51) 


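Proof of Lemma 5.7. Let $s = \ell = 1 \vee \log p$. Then bounding
$\mathbb{P}(|T_n - T_{n,m}|_\infty \ge x)$ is equivalent, up to constants, to bounding
$\mathbb{P}(|T_n - T_{n,m}|_s \ge x)$, since for any vector $v = (v_1, \ldots, v_p)^\top$,
$|v|_\infty \le |v|_s \le p^{1/s} |v|_\infty$ and $p^{1/s} \le e$. Let
$L = \lfloor (\log n - \log m)/(\log 2) \rfloor$, $\varpi_l = 2^l$, $1 \le l < L$,
$\varpi_L = \lfloor n/m \rfloor$ and $\tau_l = m \cdot \varpi_l$ for $1 \le l < L$,
$\tau_0 = m$, $\tau_L = n$. Define $M_{n,l} = T_{n,\tau_l} - T_{n,\tau_{l-1}}$ for
$1 \le l \le L$ and write

$$T_n - T_{n,m} = T_n - T_{n,n} + \sum_{l=1}^{L} M_{n,l}. \tag{52}$$

Notice that $T_n - T_{n,n} = \sum_{j=n}^{\infty} (T_{n,j+1} - T_{n,j})$. By Lemma 5.5,

$$\big\| |T_n - T_{n,n}|_s \big\|_q \le \sum_{j=n}^{\infty} \big\| |T_{n,j+1} - T_{n,j}|_s \big\|_q \le \sum_{j=n}^{\infty} C_q (ns)^{1/2} \omega_{j+1,q} = C_q (ns)^{1/2} \Omega_{n+1,q},$$

where $C_q$ is a constant only depending on $q$. By Markov's inequality, we have

$$\mathbb{P}(|T_n - T_{n,n}|_s \ge x) \le \frac{\big\| |T_n - T_{n,n}|_s \big\|_q^q}{x^q} \le \frac{C_q (ns)^{q/2} \Omega_{n+1,q}^{q}}{x^q}. \tag{53}$$

For each $1 \le l \le L$, define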


(m)An 

foi,z = ^ {Xk,Ti - Xk,Ti_i) , for 1 < i < [n/pj; 

fc=(i-l)r; + l 


K.i = 


Y, Yi, and K.I = Y 


I IS even 


i is odd 
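
The dyadic truncation levels and the block sums defined above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, $n \ge 2m$ is assumed so that $L \ge 1$, the number of blocks is taken to be $\lceil n/\tau_l \rceil$, and `T` is an arbitrary scalar stand-in for $\tau \mapsto T_{n,\tau}$ used only to show that the level differences telescope as in (52).

```python
def dyadic_levels(n, m):
    """Return [tau_0, ..., tau_L]: tau_0 = m, tau_l = m * 2**l (1 <= l < L), tau_L = n."""
    if m < 1 or n < 2 * m:
        raise ValueError("need m >= 1 and n >= 2 * m so that L >= 1")
    # L = floor(log2(n / m)), computed with integers to avoid rounding issues.
    L = (n // m).bit_length() - 1
    return [m] + [m * 2 ** l for l in range(1, L)] + [n]


def blocks(n, tau):
    """Index blocks [(i-1)*tau + 1, min(i*tau, n)], i = 1, ..., ceil(n / tau)."""
    count = -(-n // tau)  # integer ceiling of n / tau
    return [((i - 1) * tau + 1, min(i * tau, n)) for i in range(1, count + 1)]


def telescoped(T, taus):
    """Sum over l = 1..L of T(tau_l) - T(tau_{l-1}); equals T(tau_L) - T(tau_0)."""
    return sum(T(taus[l]) - T(taus[l - 1]) for l in range(1, len(taus)))


taus = dyadic_levels(1000, 7)


def T(tau):
    # Hypothetical scalar stand-in for the map tau -> T_{n,tau}.
    return 1.0 - 1.0 / (1.0 + tau)
```

Splitting the blocks at each level into even and odd indices is what makes the summands within each of the two groups independent.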


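Let $c = q/2 - 1 - \alpha q$; let $\lambda_1, \lambda_2, \ldots, \lambda_L$ be a positive sequence such that $\sum_{l=1}^L \lambda_l \le 1$;
specifically, $\lambda_l = l^{-2}/(\pi^2/3)$ if $1 \le l \le L/2$ and $\lambda_l = (L + 1 - l)^{-2}/(\pi^2/3)$ if $L/2 < l \le L$.
Since $Y_{i,l}$ and $Y_{i',l}$ are independent for $|i - i'| > 1$, by Lemma 5.6, for any $x > 0$,

$$P(|R^e_{n,l}|_s - 2E|R^e_{n,l}|_s \ge \lambda_l x) \le C_1 \frac{\sum_{i \text{ is even}} E|Y_{i,l}|_s^q}{(\lambda_l x)^q} + \exp\Big( -\frac{(\lambda_l x)^2}{3 \sum_{i \text{ is even}} |\sigma_{Y_{i,l}}|_s^2} \Big),$$

where $\sigma_{Y_{i,l}} = (\|Y_{i1,l}\|_2, \ldots, \|Y_{ip,l}\|_2)^\top$. By Lemma 5.5, $\| |Y_{i,l}|_s \|_q \le C_q (\tau_l s)^{1/2} \tilde\omega_{l,q}$, where
$\tilde\omega_{l,q} = \sum_{k=\tau_{l-1}+1}^{\tau_l} \omega_{k,q} \le \tau_{l-1}^{-\alpha} \| |X_\cdot|_\infty \|_{q,\alpha}$. For $1 \le j \le p$, by the Burkholder inequality,
$\|Y_{ij,l}\|_2 \le \tau_l^{1/2} \tilde\delta_{l,2,j}$, where $\tilde\delta_{l,2,j} = \sum_{k=\tau_{l-1}+1}^{\tau_l} \delta_{k,2,j} \le \tau_{l-1}^{-\alpha} \|X_{\cdot j}\|_{2,\alpha}$, which implies
$|\sigma_{Y_{i,l}}|_s \le p^{1/s} \tau_l^{1/2} \tau_{l-1}^{-\alpha} \Psi_{2,\alpha}$. So we obtain

$$P(|R^e_{n,l}|_s - 2E|R^e_{n,l}|_s \ge \lambda_l x) \le \frac{C_1\, n\, s^{q/2}\, \tau_l^{q/2-1} \tau_{l-1}^{-\alpha q}\, \| |X_\cdot|_\infty \|_{q,\alpha}^q}{\lambda_l^q x^q} + \exp\Big( -\frac{C_2 (\lambda_l x)^2 \tau_{l-1}^{2\alpha}}{n \Psi_{2,\alpha}^2} \Big). \tag{54}$$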
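
The weights $\lambda_l$ are chosen so that they sum to at most one, which is what allows the event $\{|\sum_l M_{n,l}|_s \ge x\}$ to be split level by level. A minimal numerical check of this property (the function name `lambda_weights` is an illustrative assumption):

```python
import math


def lambda_weights(L):
    """lambda_l = l^{-2}/(pi^2/3) for l <= L/2, (L+1-l)^{-2}/(pi^2/3) for l > L/2."""
    c = math.pi ** 2 / 3.0
    weights = []
    for l in range(1, L + 1):
        k = l if l <= L / 2 else L + 1 - l  # distance from the nearer end
        weights.append(k ** -2 / c)
    return weights


# Each half contributes less than (pi^2/6) / (pi^2/3) = 1/2, so every total is below 1.
totals = [sum(lambda_weights(L)) for L in range(1, 201)]
```

The weights are largest at the two ends $l = 1$ and $l = L$ and decay quadratically towards the middle, while $\varpi_l = 2^l$ grows geometrically; this is why quantities such as $\min_l \lambda_l \varpi_l^{\alpha}$ stay bounded away from zero.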


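By Lemma 8 of Chernozhukov et al. (2014), for $s = \log p \vee 1$,

$$E|R^e_{n,l}|_s \lesssim \sqrt{ns}\, \tau_{l-1}^{-\alpha} \Psi_{2,\alpha} + n^{1/q} s^{3/2} \tilde\omega_{l,q} \lesssim \big[ \sqrt{n\ell}\, \Psi_{2,\alpha} + n^{1/q} \ell^{3/2} \| |X_\cdot|_\infty \|_{q,\alpha} \big]\, m^{-\alpha} \varpi_l^{-\alpha}. \tag{55}$$

Notice that $\min_{l \ge 1} \lambda_l \varpi_l^{\alpha} > 0$. Hence, $E|R^e_{n,l}|_s \lesssim \lambda_l x$ and (54) implies

$$P(|R^e_{n,l}|_s \ge \lambda_l x) \le \frac{C_1\, n\, s^{q/2}\, \tau_l^{q/2-1} \tau_{l-1}^{-\alpha q}\, \| |X_\cdot|_\infty \|_{q,\alpha}^q}{\lambda_l^q x^q} + \exp\Big( -\frac{C_2 (\lambda_l x)^2 \tau_{l-1}^{2\alpha}}{n \Psi_{2,\alpha}^2} \Big).$$

A similar inequality holds for $R^o_{n,l}$. Therefore,

$$
\begin{aligned}
P\Big( \Big| \sum_{l=1}^L M_{n,l} \Big|_s \ge x \Big)
&\le \sum_{l=1}^L P(|M_{n,l}|_s \ge \lambda_l x) \\
&\le \sum_{l=1}^L P(|R^e_{n,l}|_s \ge \lambda_l x/2) + \sum_{l=1}^L P(|R^o_{n,l}|_s \ge \lambda_l x/2) \\
&\le C_1 \sum_{l=1}^L \frac{n\, s^{q/2}\, \tau_l^{q/2-1} \tau_{l-1}^{-\alpha q}\, \| |X_\cdot|_\infty \|_{q,\alpha}^q}{\lambda_l^q x^q} + 2 \sum_{l=1}^L \exp\Big( -\frac{C_2 (\lambda_l x)^2 \tau_{l-1}^{2\alpha}}{n \Psi_{2,\alpha}^2} \Big) \\
&\le \frac{C_3\, n\, m^{c}\, s^{q/2}\, \| |X_\cdot|_\infty \|_{q,\alpha}^q}{x^q} \sum_{l=1}^L \frac{\varpi_l^{c}}{\lambda_l^q} + 2 \sum_{l=1}^L \exp\Big( -\frac{C_4 \lambda_l^2 \varpi_l^{2\alpha} x^2 m^{2\alpha}}{n \Psi_{2,\alpha}^2} \Big).
\end{aligned}
\tag{56}
$$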

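By the definition of $\varpi_l$ and $\lambda_l$ and by some elementary calculation, there exists some
constant $C_6 > 1$ such that for all $t \ge 1$,

$$\sum_{l=1}^L \exp(-C_5 t \lambda_l^2 \varpi_l^{2\alpha}) \le C_6 \exp(-C_5 t \mu), \tag{57}$$

where $\mu = \min_{l \ge 1} \lambda_l^2 \varpi_l^{2\alpha} > 0$. If $c > 0$, it can be obtained that $\sum_{l=1}^L \varpi_l^{c}/\lambda_l^q \le C_7 \varpi_L^{c} \le C_7 n^c/m^c$.
If $c < 0$, then $\sum_{l=1}^L \varpi_l^{c}/\lambda_l^q \le C_8$. Hence, combining (52), (53), (56), (57), Lemma
5.7 follows. □

Lemma 5.8. Assume $\| |X_\cdot|_\infty \|_{q,\alpha} < \infty$, where $q > 2$ and $\alpha > 0$. Also assume $\Psi_{2,\alpha} < \infty$.
(i) If $\alpha > 1/2 - 1/q$, then for $x \gtrsim \sqrt{n\ell}\, \Psi_{2,\alpha} + n^{1/q} \ell^{3/2} \| |X_\cdot|_\infty \|_{q,\alpha}$,

$$P(|T_n|_\infty \ge x) \le \frac{C_{q,\alpha}\, n\, \ell^{q/2}\, \| |X_\cdot|_\infty \|_{q,\alpha}^q}{x^q} + C_{q,\alpha} \exp\Big( -\frac{C_{q,\alpha} x^2}{n \Psi_{2,\alpha}^2} \Big). \tag{58}$$

(ii) If $0 < \alpha < 1/2 - 1/q$, we have the following inequality,

$$P(|T_n|_\infty \ge x) \le \frac{C_{q,\alpha}\, n^{q/2-\alpha q}\, \ell^{q/2}\, \| |X_\cdot|_\infty \|_{q,\alpha}^q}{x^q} + C_{q,\alpha} \exp\Big( -\frac{C_{q,\alpha} x^2}{n \Psi_{2,\alpha}^2} \Big). \tag{59}$$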

Proof of Lemma 5.8. The proof is similar to that of Lemma 5.7, and thus is omitted. □ 


6 Proofs 


6.1 Proof of Theorems 3.2 and 3.3 

We shall apply the m-dependence approximation approach. For $m > 0$, define

$$X_{i,m} = (X_{i1,m}, \ldots, X_{ip,m})^\top = E(X_i \mid \varepsilon_{i-m}, \varepsilon_{i-m+1}, \ldots, \varepsilon_i). \tag{60}$$
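
To make the m-dependence approximation concrete, consider (as an assumption made purely for illustration, not a model used in the paper) a scalar causal linear process $X_i = \sum_{k \ge 0} a_k \varepsilon_{i-k}$ with i.i.d. mean-zero, unit-variance innovations. Conditioning on $\varepsilon_{i-m}, \ldots, \varepsilon_i$ then simply truncates the sum at lag $m$, and the $L^2$ approximation error is $(\sum_{k > m} a_k^2)^{1/2}$. The coefficient sequence, the truncation at lag `K` and all function names below are hypothetical.

```python
import math
import random


def linear_process(eps, coeffs, i):
    """X_i = sum_k a_k * eps[i - k], with the series truncated at len(coeffs) - 1."""
    return sum(a * eps[i - k] for k, a in enumerate(coeffs))


def m_dependent(eps, coeffs, i, m):
    """X_{i,m} = E(X_i | eps[i-m], ..., eps[i]) = sum_{k <= m} a_k * eps[i - k]."""
    return sum(a * eps[i - k] for k, a in enumerate(coeffs[: m + 1]))


def l2_error(coeffs, m):
    """||X_i - X_{i,m}||_2 for unit-variance innovations."""
    return math.sqrt(sum(a * a for a in coeffs[m + 1:]))


K = 200                                               # truncation lag (assumption)
coeffs = [(k + 1) ** -1.5 for k in range(K + 1)]      # polynomially decaying a_k
rng = random.Random(0)
eps = [rng.gauss(0.0, 1.0) for _ in range(K + 50)]
i = K + 10

errors = [l2_error(coeffs, m) for m in (1, 5, 25, 125)]
```

The error decreases as $m$ grows, at a polynomial rate governed by the decay of the coefficients; this is the kind of behaviour that the exponent $\alpha$ in the dependence-adjusted norms quantifies.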

Write $T_X = \sum_{i=1}^n X_i$, $T_{X,m} = \sum_{i=1}^n X_{i,m}$. For simplicity, suppose $n = (M + m)w$,
where $M \gg m$ and $M, m, w \to \infty$ (to be determined) as $n \to \infty$. We apply the block
technique and split the interval $[1, n]$ into alternating large blocks $L_b = [(b-1)(M+m)+1,\, bM + (b-1)m]$
and small blocks $S_b = [bM + (b-1)m + 1,\, b(M+m)]$, $1 \le b \le w$. Let

$$Y_b = \sum_{i \in L_b} X_i, \quad Y_{b,m} = \sum_{i \in L_b} X_{i,m}, \quad T_Y = \sum_{b=1}^w Y_b, \quad T_{Y,m} = \sum_{b=1}^w Y_{b,m}.$$

Let $Z_b$, $1 \le b \le w$, be i.i.d. $N(0, MB)$ and $Z_{b,m}$ be i.i.d. $N(0, M\tilde B)$, where the covariance
matrices $B$ and $\tilde B$ are respectively given by

$$B = (b_{ij})_{i,j=1}^p = \mathrm{Cov}(Y_b/\sqrt{M}) \quad \text{and} \quad \tilde B = (\tilde b_{ij})_{i,j=1}^p = \mathrm{Cov}(Y_{b,m}/\sqrt{M}). \tag{61}$$

Write $T_{Z,m} = \sum_{b=1}^w Z_{b,m}$ and let $Z \sim N(0, \Sigma)$.
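
The alternating large/small block layout can be sketched as follows. The function name `alternating_blocks` is an illustrative assumption; the index formulas are those of $L_b$ and $S_b$ with $n = (M + m)w$.

```python
def alternating_blocks(M, m, w):
    """Return (large, small): lists of inclusive index ranges L_b and S_b, b = 1..w."""
    large, small = [], []
    for b in range(1, w + 1):
        # L_b = [(b-1)(M+m) + 1, bM + (b-1)m] has length M.
        large.append(((b - 1) * (M + m) + 1, b * M + (b - 1) * m))
        # S_b = [bM + (b-1)m + 1, b(M+m)] has length m.
        small.append((b * M + (b - 1) * m + 1, b * (M + m)))
    return large, small


large, small = alternating_blocks(M=5, m=2, w=3)  # n = (5 + 2) * 3 = 21
```

Consecutive large blocks are separated by a small block of length $m$, so the m-dependent sums $Y_{b,m}$ over different large blocks are independent, while the small blocks together contain only $wm$ of the $n$ indices.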


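Lemma 6.1. (i) Assume $\Theta_{q,\alpha} < \infty$ for some $q > 2$ and $\alpha > 0$. Then there exists some
constant $C_{q,\alpha}$ such that for $y > 0$

$$P(|T_X - T_{Y,m}|_\infty \ge y) \lesssim f_1^*(y) + f_2^*(y) =: f^*(y), \tag{62}$$

where the constant in $\lesssim$ only depends on q and α,

$$
f_1^*(y) =
\begin{cases}
y^{-q}\, n\, m^{q/2-1-\alpha q}\, \Theta_{q,\alpha}^q + p \exp\Big( -\dfrac{C_{q,\alpha} y^2 m^{2\alpha}}{n \Psi_{2,\alpha}^2} \Big), & \alpha > 1/2 - 1/q, \\[2ex]
y^{-q}\, n^{q/2-\alpha q}\, \Theta_{q,\alpha}^q + p \exp\Big( -\dfrac{C_{q,\alpha} y^2 m^{2\alpha}}{n \Psi_{2,\alpha}^2} \Big), & \alpha < 1/2 - 1/q,
\end{cases}
\tag{63}
$$

and

$$
f_2^*(y) =
\begin{cases}
y^{-q}\, w m\, \Theta_{q,\alpha}^q + p \exp\Big( -\dfrac{C_{q,\alpha} y^2}{w m \Psi_{2,\alpha}^2} \Big), & \alpha > 1/2 - 1/q, \\[2ex]
y^{-q}\, (wm)^{q/2-\alpha q}\, \Theta_{q,\alpha}^q + p \exp\Big( -\dfrac{C_{q,\alpha} y^2}{w m \Psi_{2,\alpha}^2} \Big), & \alpha < 1/2 - 1/q.
\end{cases}
\tag{64}
$$

(ii) Assume $\Phi_{\psi_\nu,\alpha} < \infty$ for some $\nu > 0$ and $\alpha > 0$. Let $\beta = 2/(1 + 2\nu)$. Then there exists
a constant $C_\beta > 0$ such that for $y > 0$,

$$P(|T_X - T_{Y,m}|_\infty \ge y) \lesssim f_1^\diamond(y) + f_2^\diamond(y) =: f^\diamond(y), \tag{65}$$

where the constant in $\lesssim$ only depends on β and α,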

iMri)] ^(Tisk:;)'} ■ 

Proof. Let $P_1 = P(|T_X - T_{X,m}|_\infty \ge y/2)$ and $P_2 = P(|T_{X,m} - T_{Y,m}|_\infty \ge y/2)$. Lemmas 5.1
and 5.7 imply that $P_1 \lesssim f_1^*(y)$. Write $T_{X,m} - T_{Y,m} = \sum_{b=1}^w \sum_{i \in S_b} X_{i,m}$. By Lemmas 5.2
and 5.8, we also have $P_2 \lesssim f_2^*(y)$. Hence both cases with $\alpha > 1/2 - 1/q$ and $\alpha < 1/2 - 1/q$
of Lemma 6.1(i) follow in view of $P(|T_X - T_{Y,m}|_\infty \ge y) \le P_1 + P_2$.

The exponential moment case (ii) similarly follows from $P_1 \lesssim f_1^\diamond(y)$ and $P_2 \lesssim f_2^\diamond(y)$. □

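Lemma 6.2. Let $D = (d_{ij})_{i,j=1}^p$ be a diagonal matrix. Assume that there exist constants
$c > 0$, $c_2 > c_1 > 0$ such that $c \le \min_{1 \le j \le p} d_{jj}$ and $c_1 \le \tilde b_{jj}/d_{jj} \le c_2$ for all $1 \le j \le p$.
Assume $\Psi_{q,0} < \infty$ for some $q > 4$. Then for all $\lambda \in (0,1)$,

$$
\begin{aligned}
&\sup_{t \in \mathbb{R}} \big| P(|D^{-1/2} T_{Y,m}/\sqrt{n}|_\infty \le t) - P(|D^{-1/2} T_{Z,m}/\sqrt{n}|_\infty \le t) \big| \\
&\quad \lesssim w^{-1/8} (\Psi_{3,0}^{3/4} \vee \Psi_{4,0}^{1/2}) (\log(pw/\lambda))^{7/8} + w^{-1/2} (\log(pw/\lambda))^{3/2} u_m(\lambda) + \lambda \\
&\quad =: h(\lambda, u_m(\lambda)),
\end{aligned}
$$

where the constant in $\lesssim$ depends on $c$, $c_1$, $c_2$, and q and α for (i), and β for (ii) below, and
$u_m(\lambda) \lesssim u_m^*(\lambda)$ in (i), and $u_m(\lambda) \lesssim u_m^\diamond(\lambda)$ in (ii).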

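(i) Assume $\Theta_{q,\alpha} < \infty$ for some $q > 4$ and $\alpha > 0$, then

$$
u_m^*(\lambda) =
\begin{cases}
\max\{ \Theta_{q,\alpha} (\lambda^{-1} w)^{1/q} M^{1/q - 1/2},\ \Psi_{2,\alpha} \sqrt{\log(pw/\lambda)} \}, & \alpha > 1/2 - 1/q, \\
\max\{ \Theta_{q,\alpha} (\lambda^{-1} w)^{1/q} M^{-\alpha},\ \Psi_{2,\alpha} \sqrt{\log(pw/\lambda)} \}, & \alpha < 1/2 - 1/q.
\end{cases}
\tag{66}
$$

(ii) Assume $\Phi_{\psi_\nu,0} < \infty$ for some $\nu > 0$. Then

$$u_m^\diamond(\lambda) = \max\{ \Phi_{\psi_\nu,0} (\log(pw/\lambda))^{1/\beta},\ \sqrt{\log(pw/\lambda)} \}. \tag{67}$$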

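Proof. For $1 \le l \le q$, define $R_l = \max_{1 \le j \le p} \|M^{-1/2} Y_{bj,m}\|_l$. Since $X_{ij,m} = \sum_{k=0}^m \mathcal{P}^{i-k} X_{ij,m}$,
by Burkholder's inequality (Burkholder (1973)),

$$\Big\| \sum_{i=1}^M \mathcal{P}^{i-k} X_{ij,m} \Big\|_l^2 \le c_l \sum_{i=1}^M \big\| \mathcal{P}^{i-k} X_{ij,m} \big\|_l^2 \le c_l M \delta_{k,l,j}^2,$$

then we have

$$\Big\| \sum_{i=1}^M X_{ij,m} \Big\|_l \le \sum_{k=0}^m \Big\| \sum_{i=1}^M \mathcal{P}^{i-k} X_{ij,m} \Big\|_l, \tag{68}$$

which implies $R_l \lesssim \Psi_{l,0}$. For $0 < \lambda < 1$ and the diagonal matrix $D = (d_{ij})_{i,j=1}^p$, define
$u_{Y,m}(\lambda)$ as the infimum over all numbers $u > 0$ such that

$$P\big( |M^{-1/2} d_{jj}^{-1/2} Y_{bj,m}| \le u,\ 1 \le b \le w,\ 1 \le j \le p \big) \ge 1 - \lambda.$$

Also define $u_{Z,m}(\lambda)$ by the corresponding quantity for the analogue Gaussian case, namely
with $Y_{b,m}$ replaced by $Z_{b,m}$ in the above definition. Let $u_m(\lambda) := u_{Y,m}(\lambda) \vee u_{Z,m}(\lambda)$. By
Theorem 2.2 of Chernozhukov et al. (2013a), for all $\lambda \in (0,1)$,

$$
\begin{aligned}
&\sup_{t \in \mathbb{R}} \big| P(|D^{-1/2} T_{Y,m}/\sqrt{n}|_\infty \le t) - P(|D^{-1/2} T_{Z,m}/\sqrt{n}|_\infty \le t) \big| \\
&\quad \lesssim w^{-1/8} (R_3^{3/4} \vee R_4^{1/2}) (\log(pw/\lambda))^{7/8} + w^{-1/2} (\log(pw/\lambda))^{3/2} u_m(\lambda) + \lambda.
\end{aligned}
$$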


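Now we shall find a bound on the function $u_m(\lambda)$. (i) By Lemmas 5.2 and 5.8, we have

$$
\begin{aligned}
&P\big( |M^{-1/2} d_{jj}^{-1/2} Y_{bj,m}| \ge u \text{ for some } b, j \big) \le \sum_{b=1}^w P\big( |M^{-1/2} Y_{b,m}|_\infty \ge c^{1/2} u \big) \\
&\quad \le
\begin{cases}
C_{q,\alpha} u^{-q} w M^{1-q/2} \Theta_{q,\alpha}^q + C_{q,\alpha}\, p w \exp\Big( -\dfrac{C_{q,\alpha} u^2}{\Psi_{2,\alpha}^2} \Big), & \alpha > 1/2 - 1/q, \\[2ex]
C_{q,\alpha} u^{-q} w M^{-\alpha q} \Theta_{q,\alpha}^q + C_{q,\alpha}\, p w \exp\Big( -\dfrac{C_{q,\alpha} u^2}{\Psi_{2,\alpha}^2} \Big), & \alpha < 1/2 - 1/q.
\end{cases}
\end{aligned}
$$

This implies $u_{Y,m}(\lambda) \le C_{q,\alpha} \max\{ \Theta_{q,\alpha} (\lambda^{-1} w)^{1/q} M^{1/q-1/2},\ \Psi_{2,\alpha} \sqrt{\log(pw/\lambda)} \}$ if $\alpha > 1/2 - 1/q$
and $u_{Y,m}(\lambda) \le C_{q,\alpha} \max\{ \Theta_{q,\alpha} (\lambda^{-1} w)^{1/q} M^{-\alpha},\ \Psi_{2,\alpha} \sqrt{\log(pw/\lambda)} \}$ if $\alpha < 1/2 - 1/q$. For
$u_{Z,m}(\lambda)$, since $Z_{bj,m} \sim N(0, M \tilde b_{jj})$, we have $E(\exp\{ M^{-1} Z_{bj,m}^2/(4 \tilde b_{jj}) \}) \le C$. Hence

$$
\begin{aligned}
P\big( |M^{-1/2} d_{jj}^{-1/2} Z_{bj,m}| \ge u \text{ for some } b, j \big)
&\le \sum_{b=1}^w \sum_{j=1}^p P\big( |M^{-1/2} Z_{bj,m}| \ge d_{jj}^{1/2} u \big) \\
&\le C p w \exp\big( -d_{jj} u^2/(4 \tilde b_{jj}) \big).
\end{aligned}
\tag{69}
$$

With the assumption $c_1 \le \tilde b_{jj}/d_{jj} \le c_2$, $u_{Z,m}(\lambda) \le C \sqrt{\log(pw/\lambda)}$.

(ii) By the Bonferroni inequality and Lemma 5.4,

$$P\big( |M^{-1/2} d_{jj}^{-1/2} Y_{bj,m}| \ge u \text{ for some } b, j \big) \le C_\beta\, p w \exp\Big\{ -C_\beta \frac{u^\beta}{\Phi_{\psi_\nu,0}^\beta} \Big\}, \tag{70}$$

where $\beta = 2/(1 + 2\nu)$ and $C_\beta$ is a constant that depends on β only. Combining (69) and
(70), it follows that $u_m(\lambda) \le C_\beta \max\{ \Phi_{\psi_\nu,0} (\log(pw/\lambda))^{1/\beta},\ \sqrt{\log(pw/\lambda)} \}$. □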
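
The $\sqrt{\log(pw/\lambda)}$ order of the Gaussian threshold can be seen from a plain union bound. The sketch below is an illustration under a simplifying assumption: all $pw$ coordinates are standardized to $N(0,1)$, whereas in the lemma the variance ratios $\tilde b_{jj}/d_{jj}$ are only bounded between $c_1$ and $c_2$. The function names are hypothetical.

```python
import math


def gaussian_two_sided_tail(u):
    """P(|g| > u) for g ~ N(0, 1), via the complementary error function."""
    return math.erfc(u / math.sqrt(2.0))


def union_threshold(p, w, lam):
    """u solving p * w * 2 * exp(-u^2 / 2) = lam, i.e. u = sqrt(2 log(2 p w / lam))."""
    return math.sqrt(2.0 * math.log(2.0 * p * w / lam))


p, w, lam = 5000, 40, 0.05
u = union_threshold(p, w, lam)
# Union bound over all p * w coordinates; valid whatever their dependence.
exceedance_bound = p * w * gaussian_two_sided_tail(u)
```

Since $P(|g| > u) \le 2\exp(-u^2/2)$, the union bound is at most $\lambda$ at this threshold, so a threshold of order $\sqrt{\log(pw/\lambda)}$ suffices in the standardized case.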

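Now we consider the comparison between $Z$ and $T_{Z,m}$. Let $\pi(x) = x^{1/3} (1 \vee \log(p/x))^{2/3}$
for $x > 0$.

Lemma 6.3. Assume $\Psi_{2,\alpha} < \infty$ for some $\alpha > 0$. Let $D = (d_{ij})_{i,j=1}^p$ be a diagonal matrix
such that there exist some constants $0 < C_1 < C_2$ such that $C_1 \le \sigma_{jj}/d_{jj} \le C_2$ for all
$1 \le j \le p$. Then we have

$$
\begin{aligned}
&\sup_{t \in \mathbb{R}} \big| P(|D^{-1/2} T_{Z,m}/\sqrt{n}|_\infty \le t) - P(|D^{-1/2} Z|_\infty \le t) \big| \\
&\quad \lesssim \pi\Big( \max_{1 \le j \le p} d_{jj}^{-1}\, \Psi_{2,\alpha} \Psi_{2,0} (m^{-\alpha} + v(M)) + wm/n \Big),
\end{aligned}
$$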

where v{M) is the same as defined in Corollary f.3. 

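Proof. By the definition of $T_{Z,m}$ and $Z$ and (61),

$$\Sigma_{Z,m} := \mathrm{Cov}(D^{-1/2} T_{Z,m}/\sqrt{n}) = \frac{Mw}{n} D^{-1/2} \tilde B D^{-1/2},$$

$$\Sigma_{Z} := \mathrm{Cov}(D^{-1/2} Z) = D^{-1/2} \Sigma D^{-1/2}.$$

Let $S_{Mj} = \sum_{i=1}^M X_{ij}$, $S_{Mj,m} = \sum_{i=1}^M X_{ij,m}$. By the moment inequality in Wu (2005),
$\|S_{Mj}\|_2 \le \sqrt{M} \Psi_{2,0}$, $\|S_{Mj,m}\|_2 \le \sqrt{M} \Psi_{2,0}$ and $\|S_{Mj} - S_{Mj,m}\|_2 \le \sqrt{M} \Psi_{2,\alpha} m^{-\alpha}$. Note
that $b_{jk} = M^{-1} E(S_{Mj} S_{Mk})$ and $\tilde b_{jk} = M^{-1} E(S_{Mj,m} S_{Mk,m})$. Then

$$
\begin{aligned}
|b_{jk} - \tilde b_{jk}| &= \frac{1}{M} \big| E(S_{Mj} S_{Mk} - S_{Mj,m} S_{Mk,m}) \big| \\
&\le \frac{1}{M} \big( \|S_{Mj}\|_2 \cdot \|S_{Mk} - S_{Mk,m}\|_2 + \|S_{Mk,m}\|_2 \cdot \|S_{Mj} - S_{Mj,m}\|_2 \big) \\
&\le 2 \Psi_{2,\alpha} \Psi_{2,0} m^{-\alpha}.
\end{aligned}
$$

Recall that $\sigma_{jk} = \sum_{l=-\infty}^{\infty} \gamma_{jk}(l)$ and

$$b_{jk} = M^{-1} E(S_{Mj} S_{Mk}) = M^{-1} \sum_{l=-M}^{M} (M - |l|) \gamma_{jk}(l).$$

It follows that

$$\sigma_{jk} - b_{jk} = \sum_{|l| > M} \gamma_{jk}(l) + M^{-1} \sum_{l=-M}^{M} |l| \gamma_{jk}(l).$$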

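By $X_{ij} = \sum_{h=0}^{\infty} \mathcal{P}^{i-h} X_{ij}$, we have

$$|\gamma_{jk}(l)| = \Big| \sum_{h=0}^{\infty} E[(\mathcal{P}^{-h} X_{0j})(\mathcal{P}^{-h} X_{lk})] \Big| \le \sum_{h=0}^{\infty} \big| E[(\mathcal{P}^{-h} X_{0j})(\mathcal{P}^{-h} X_{lk})] \big| \le \sum_{h=0}^{\infty} \delta_{h,2,j}\, \delta_{h+l,2,k}.$$

Hence, it can be obtained that

$$\sum_{|l| > M} |\gamma_{jk}(l)| \le 2 \sum_{l=M+1}^{\infty} |\gamma_{jk}(l)| \le 2 \sum_{l=M+1}^{\infty} \sum_{h=0}^{\infty} \delta_{h,2,j}\, \delta_{h+l,2,k} \le 2 \Delta_{0,2,j} \Delta_{M+1,2,k},$$

and

$$\frac{1}{M} \sum_{l=-M}^{M} |l|\, |\gamma_{jk}(l)| \le \frac{2}{M} \sum_{l=1}^{M} \sum_{i=l}^{M} \sum_{h=0}^{\infty} \delta_{h,2,j}\, \delta_{h+i,2,k} \le \frac{2}{M} \Delta_{0,2,j} \sum_{l=1}^{M} \Delta_{l,2,k}.$$

Since $\Delta_{0,2,j} \le \Psi_{2,0}$ and $\Delta_{m,2,j} \le \Psi_{2,\alpha} m^{-\alpha}$, $\max_{1 \le j,k \le p} |b_{jk} - \sigma_{jk}| \lesssim \Psi_{2,\alpha} \Psi_{2,0} v(M)$. Hence,

$$
\begin{aligned}
|\Sigma_{Z,m} - \Sigma_{Z}|_\infty
&\le \max_{1 \le j \le p} d_{jj}^{-1} \big( |\tilde B - B|_\infty + |B - \Sigma|_\infty \big) + (1 - Mw/n)\, |D^{-1/2} \Sigma D^{-1/2}|_\infty \\
&\lesssim \max_{1 \le j \le p} d_{jj}^{-1}\, \Psi_{2,\alpha} \Psi_{2,0} (m^{-\alpha} + v(M)) + C_2\, wm/n.
\end{aligned}
$$

By Theorem 2 of Chernozhukov et al. (2014), the result follows.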


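□

Theorem 6.4. Let $\Sigma_0$ be the diagonal matrix of the long run covariance matrix $\Sigma$ and
$D_0 = \Sigma_0^{1/2}$. Let Assumption 3.1 be satisfied. (i) Assume that $\Theta_{q,\alpha} < \infty$ holds with some
$q > 4$ and $\alpha > 0$. Then for every $\lambda \in (0,1)$ and $\eta > 0$,

$$
\begin{aligned}
\rho_n &:= \sup_{t \in \mathbb{R}} \big| P(|D_0^{-1} T_X/\sqrt{n}|_\infty \le t) - P(|D_0^{-1} Z|_\infty \le t) \big| \\
&\lesssim f^*(\sqrt{n}\eta) + \eta \sqrt{\log p} + h(\lambda, u_m^*(\lambda)) + \pi\big( \Psi_{2,\alpha} \Psi_{2,0} (m^{-\alpha} + v(M)) + wm/n \big).
\end{aligned}
\tag{71}
$$

(ii) Assume $\Phi_{\psi_\nu,\alpha} < \infty$ for some $\nu > 0$ and $\alpha > 0$. Then for every $\lambda \in (0,1)$ and $\eta > 0$,

$$
\begin{aligned}
\rho_n &:= \sup_{t \in \mathbb{R}} \big| P(|D_0^{-1} T_X/\sqrt{n}|_\infty \le t) - P(|D_0^{-1} Z|_\infty \le t) \big| \\
&\lesssim f^\diamond(\sqrt{n}\eta) + \eta \sqrt{\log p} + h(\lambda, u_m^\diamond(\lambda)) + \pi\big( \Psi_{2,\alpha} \Psi_{2,0} (m^{-\alpha} + v(M)) + wm/n \big).
\end{aligned}
\tag{72}
$$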

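Proof. (i) By Lemma 6.2 (i) and Lemma 6.3, we have for every $\lambda \in (0,1)$,

$$
\begin{aligned}
&\sup_{t \in \mathbb{R}} \big| P(|D_0^{-1} T_{Y,m}/\sqrt{n}|_\infty \le t) - P(|D_0^{-1} Z|_\infty \le t) \big| \\
&\quad \lesssim h(\lambda, u_m^*(\lambda)) + \pi\big( \Psi_{2,\alpha} \Psi_{2,0} (m^{-\alpha} + v(M)) + wm/n \big).
\end{aligned}
\tag{73}
$$

Observe that each component of the Gaussian vector $D_0^{-1} Z$ has variance 1. By Theorem
3 of Chernozhukov et al. (2014), for every $\eta > 0$,

$$\sup_{t \in \mathbb{R}} P\big( \big| |D_0^{-1} Z|_\infty - t \big| \le \eta \big) \lesssim \eta \sqrt{\log p}. \tag{74}$$

By the triangle inequality, for every $\eta > 0$, we have

$$
\begin{aligned}
&\sup_{t \in \mathbb{R}} \big| P(|D_0^{-1} T_X/\sqrt{n}|_\infty \ge t) - P(|D_0^{-1} T_{Y,m}/\sqrt{n}|_\infty \ge t) \big| \\
&\quad \le P\big( |D_0^{-1} (T_X - T_{Y,m})/\sqrt{n}|_\infty \ge \eta \big) + \sup_{t \in \mathbb{R}} P\big( \big| |D_0^{-1} T_{Y,m}/\sqrt{n}|_\infty - t \big| \le \eta \big),
\end{aligned}
$$

which implies Theorem 6.4 (i) in view of Lemma 6.1 (i), (73) and (74).

(ii) Inequality (72) can be obtained by replacing $f^*$ and $u_m^*$ with $f^\diamond$ and $u_m^\diamond$ in the
above proof. □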
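
The triangle-inequality step used in the proof can be spelled out as follows (a short worked derivation added for clarity; the shorthand $A$, $B$ and $\Delta$ is introduced here only for this purpose). Write $A = |D_0^{-1} T_X/\sqrt{n}|_\infty$, $B = |D_0^{-1} T_{Y,m}/\sqrt{n}|_\infty$ and $\Delta = |D_0^{-1} (T_X - T_{Y,m})/\sqrt{n}|_\infty$, so that $|A - B| \le \Delta$. On the event $\{\Delta \le \eta\}$, $A \ge t$ forces $B \ge t - \eta$, and $B \ge t + \eta$ forces $A \ge t$. Therefore, for every $t$,

$$
\begin{aligned}
P(A \ge t) &\le P(B \ge t - \eta) + P(\Delta > \eta) \le P(B \ge t) + P(|B - t| \le \eta) + P(\Delta \ge \eta), \\
P(B \ge t) &\le P(B \ge t + \eta) + P(|B - t| \le \eta) \le P(A \ge t) + P(\Delta \ge \eta) + P(|B - t| \le \eta).
\end{aligned}
$$

Taking the supremum over $t$ gives the bound displayed in the proof. The anti-concentration term for $B$ is then controlled by comparing $B$ with $|D_0^{-1} Z|_\infty$ through (73) and applying (74).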


6.2 Proof of Theorem 3.2 

Proof. Recall (62) for $f^*(\cdot)$. By Theorem 6.4, for $\alpha > 1/2 - 1/q$, to have (7), we need

$$\pi\big( \Psi_{2,\alpha} \Psi_{2,0} (m^{-\alpha} + v(M)) + wm/n \big) \to 0 \tag{75}$$

and for some $\eta > 0$ and $\lambda \in (0,1)$,

$$f^*(\sqrt{n}\eta) + \eta \sqrt{\log p} \to 0, \tag{76}$$

$$h(\lambda, u_m^*(\lambda)) \to 0. \tag{77}$$

Firstly, (75) requires $m \gg L_2$, $wm \ll n(\log p)^{-2}$, $w \ll n(\log p)^{-2} (\Psi_{2,\alpha} \Psi_{2,0})^{-1}$ if $\alpha > 1$ and
$w \ll n/L_2$ if $0 < \alpha < 1$. Moreover, (76) requires $m \gg \max(L_1, (\Psi_{2,\alpha} \log p)^{1/\alpha})$ and
$wm \ll \min(N_1, N_2)$. And (77) needs (8) and $w \gg \max(W_1, W_2)$. We also need $M \asymp n/w \gg m$.
Notice that $(\Psi_{2,\alpha} \log p)^{1/\alpha} \le L_2$, $N_2 \le n(\log p)^{-2}$ and $N_2 \le n(\log p)^{-2} (\Psi_{2,\alpha} \Psi_{2,0})^{-1}$. If

$$\max(L_1, L_2) \max(W_1, W_2) = o(1) \min(n, N_1, N_2), \tag{78}$$

then we can always choose $m$ and $w$ such that (7) holds. Observe that $N_2 \le n$; then (78)
is reduced to (9).

For 0 < a < 1/2 — 1/q, the function f* in (76) is replaced by P (cf. (65)), which 
implies 0 g^Q,(logp)^A = 0 ( 77 ,“), m 3> (\I' 2 ,a logp)^'^“ and wm -C mm(N 2 , N^). And in 
(77) is replaced by u^, implying w 3> max(hFi, hF 2 , hFs). By the similar argument, if (10) 
is further assumed, then (7) also holds for the case 0<a<l/2 — 1/g. □ 

Remark 4. In the proof of Theorem 3.2, we exclude the case $\alpha = 1$ when $\alpha > 1/2 - 1/q$.
If $\alpha = 1$, we need to impose the additional assumption

$$\max(W_1, W_2) = o(n/(L_2 \log n)) \tag{79}$$

to ensure (75). The above condition is very mild since (9) implies $\max(W_1, W_2) = o(n/L_2)$.
If $\log n \lesssim (\log p)^2 \Psi_{2,\alpha}^2$, which trivially holds in the high-dimensional case $p \ge n^{\kappa}$ with some
$\kappa > 0$, we have $N_2 = O(n/\log n)$ and hence (9) implies (79). Similarly, it is further
assumed that $\max(W_1, W_4) = o(n/(L_2 \log n))$ in Theorem 3.3 if $\alpha = 1$.